[jira] Commented: (HBASE-3196) Regionserver stuck when after all IPC Server handlers fatal'd
[ https://issues.apache.org/jira/browse/HBASE-3196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12928138#action_12928138 ] stack commented on HBASE-3196: -- @Prakash It says 'org.apache.hadoop.ipc.RPC$VersionMismatch: Protocol org.apache.hadoop.hdfs.protocol.ClientDatanodeProtocol version mismatch. (client = 5, server = 3)' This RS is up because its waiting on all regions to close before it goes out: {code} at org.apache.hadoop.hbase.regionserver.HRegionServer.waitOnAllRegionsToClose(HRegionServer.java:645) {code} Seems like there are closer handlers waiting to do work. Does it say in the regionserver log what region is not closing? If so, can you grep it and try figure some history on the region? Thanks. > Regionserver stuck when after all IPC Server handlers fatal'd > - > > Key: HBASE-3196 > URL: https://issues.apache.org/jira/browse/HBASE-3196 > Project: HBase > Issue Type: Bug >Affects Versions: 0.90.0 >Reporter: Prakash Khemani >Assignee: Jonathan Gray > > The region server is stuck with the following jstack > 2010-11-03 22:23:41 > Full thread dump Java HotSpot(TM) 64-Bit Server VM (14.0-b16 mixed mode): > "Attach Listener" daemon prio=10 tid=0x2aaeb6774000 nid=0x3974 waiting on > condition [0x] >java.lang.Thread.State: RUNNABLE > "RS_CLOSE_REGION-pumahbase028.snc5.facebook.com,60020,1288733355197-2" > prio=10 tid=0x2aaeb8449000 nid=0x3bbc waiting on condition > [0x43f67000] >java.lang.Thread.State: WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0x2aaab7fd1130> (a > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) > at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1925) > at > java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:358) > at > 
java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:947) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907) > at java.lang.Thread.run(Thread.java:619) > "RS_CLOSE_REGION-pumahbase028.snc5.facebook.com,60020,1288733355197-1" > prio=10 tid=0x2aaeb843f800 nid=0x3bbb waiting on condition > [0x43e66000] >java.lang.Thread.State: WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0x2aaab7fd1130> (a > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) > at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1925) > at > java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:358) > at > java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:947) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907) > at java.lang.Thread.run(Thread.java:619) > "RS_CLOSE_REGION-pumahbase028.snc5.facebook.com,60020,1288733355197-0" > prio=10 tid=0x2aaeb8447800 nid=0x3bba waiting on condition > [0x44068000] >java.lang.Thread.State: WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0x2aaab7fd1130> (a > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) > at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1925) > at > java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:358) > at > java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:947) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907) > at java.lang.Thread.run(Thread.java:619) > "RMI Scheduler(0)" daemon prio=10 tid=0x2aaeb48c4800 nid=0x1c97 waiting > on condition [0x580a7000] 
>java.lang.Thread.State: TIMED_WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0x2aaab773a118> (a > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) > at > java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:198) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:1963) > at java.util.concurrent.DelayQueue.take(DelayQueue.java:164) > at > java.util.concurrent.ScheduledThreadP
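Conceptually, the waitOnAllRegionsToClose call named in the stack trace is a loop of this shape (a simplified sketch only, not HRegionServer's actual code; the map type and sleep interval are assumptions):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class WaitOnClose {
    /** Blocks until every region has been removed from the online-regions map. */
    static void waitOnAllRegionsToClose(Map<String, Object> onlineRegions)
            throws InterruptedException {
        while (!onlineRegions.isEmpty()) {
            Thread.sleep(100);  // the real server would log progress while it waits
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Map<String, Object> regions = new ConcurrentHashMap<>();
        waitOnAllRegionsToClose(regions);  // empty map: returns immediately
    }
}
```

If a close handler never removes its region from the map (as suspected here), this loop parks forever, which matches the stuck server described in the report.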
[jira] Updated: (HBASE-2819) hbck should have the ability to repair basic problems
[ https://issues.apache.org/jira/browse/HBASE-2819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-2819: - Attachment: 2819-addendum.txt Something to add to this patch -- being able to deal with empty cells in .META. HBCK should fix these up. > hbck should have the ability to repair basic problems > - > > Key: HBASE-2819 > URL: https://issues.apache.org/jira/browse/HBASE-2819 > Project: HBase > Issue Type: New Feature > Components: scripts >Reporter: Todd Lipcon >Assignee: stack >Priority: Critical > Fix For: 0.90.0 > > Attachments: 2819-addendum.txt, 2819-v10.txt, 2819-v11.txt, > 2819-v12.txt, HBASE-2819.patch > > > Right now, the hbck utility can detect issues with region deployment but > can't fix them. > It should be able to handle basic things like closing one side of a double > assignment, re-adding something to META, etc. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Resolved: (HBASE-3192) Test that HBase runs when a .META. row without an HRI
[ https://issues.apache.org/jira/browse/HBASE-3192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack resolved HBASE-3192. -- Resolution: Won't Fix Resolving as 'wont fix'. There are so many places in the code that presume a non-null regioninfo in .META. - - MetaScanner, MetaReader, AssignmentManager, CatalogJanitor, etc. -- that a test would be hard to write. Would need to test w/ empty HRI during master joining cluster, during bulk startup, during 'normal' operation. > Test that HBase runs when a .META. row without an HRI > - > > Key: HBASE-3192 > URL: https://issues.apache.org/jira/browse/HBASE-3192 > Project: HBase > Issue Type: Bug >Reporter: stack >Assignee: stack > Fix For: 0.90.0 > > Attachments: 3192.txt > > > A .META. without an HRI entry should never happen but if it does, it should > not cause master shutdown (master is on a hair-trigger at mo. so that issues > are noticed quickly). HBASE-3151 fixed being able to deal w/ empty HRI. > This issue is about adding a test to verify hbase stays up (make sure chore > runs and that test does meta scanning with MetaScanner and MetaReader). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HBASE-3196) Regionserver stuck when after all IPC Server handlers fatal'd
[ https://issues.apache.org/jira/browse/HBASE-3196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prakash Khemani updated HBASE-3196: --- Description: The region server is stuck with the following jstack 2010-11-03 22:23:41 Full thread dump Java HotSpot(TM) 64-Bit Server VM (14.0-b16 mixed mode): "Attach Listener" daemon prio=10 tid=0x2aaeb6774000 nid=0x3974 waiting on condition [0x] java.lang.Thread.State: RUNNABLE "RS_CLOSE_REGION-pumahbase028.snc5.facebook.com,60020,1288733355197-2" prio=10 tid=0x2aaeb8449000 nid=0x3bbc waiting on condition [0x43f67000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x2aaab7fd1130> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158) at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1925) at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:358) at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:947) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907) at java.lang.Thread.run(Thread.java:619) "RS_CLOSE_REGION-pumahbase028.snc5.facebook.com,60020,1288733355197-1" prio=10 tid=0x2aaeb843f800 nid=0x3bbb waiting on condition [0x43e66000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x2aaab7fd1130> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158) at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1925) at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:358) at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:947) at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907) at java.lang.Thread.run(Thread.java:619) "RS_CLOSE_REGION-pumahbase028.snc5.facebook.com,60020,1288733355197-0" prio=10 tid=0x2aaeb8447800 nid=0x3bba waiting on condition [0x44068000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x2aaab7fd1130> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158) at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1925) at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:358) at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:947) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907) at java.lang.Thread.run(Thread.java:619) "RMI Scheduler(0)" daemon prio=10 tid=0x2aaeb48c4800 nid=0x1c97 waiting on condition [0x580a7000] java.lang.Thread.State: TIMED_WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x2aaab773a118> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:198) at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:1963) at java.util.concurrent.DelayQueue.take(DelayQueue.java:164) at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:583) at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:576) at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:947) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907) at java.lang.Thread.run(Thread.java:619) "RS_OPEN_REGION-pumahbase028.snc5.facebook.com,60020,1288733355197-2" daemon prio=10 
tid=0x2aaeb4804800 nid=0x17a0 waiting on condition [0x582a9000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x2aaab7fca538> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158) at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1925) at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:358) at java.util.
[jira] Updated: (HBASE-2328) Make important configurations more obvious to new users
[ https://issues.apache.org/jira/browse/HBASE-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-2328: - Attachment: notsoquick.html I just committed an edit that adds example config. for distributed hbase and that adds requirements section from overview with some extra fill. Sill a bunch TODO. I added what page currently looks like. > Make important configurations more obvious to new users > --- > > Key: HBASE-2328 > URL: https://issues.apache.org/jira/browse/HBASE-2328 > Project: HBase > Issue Type: Improvement > Components: documentation >Reporter: Jean-Daniel Cryans > Fix For: 0.90.0 > > Attachments: notsoquick.html > > > Over the last 2 weeks, I encountered many situations where people didn't set > file descriptors and xcievers higher and that was causing a ton of problems > that are hard to debug if you're not used to them. To improve that we should: > - Refuse to start HBase if ulimit -n returns some small number smaller than > 2048, or at least print out in big red blinking letters that the current > configuration is bad and then link to a simple troubleshooting entry on the > wiki. > - Write a clearer Getting Started document where we don't give as much > explanations but add more stuff like "this is what your > hbase-site.xml/hdfs-site/xml should look like now" and give a complete file > example. At this point we don't even give a number for xcievers and we expect > new users to come up with one. > Any other low hanging fruit others can think of? -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (HBASE-3196) Regionserver stuck when after all IPC Server handlers fatal'
Regionserver stuck when after all IPC Server handlers fatal' Key: HBASE-3196 URL: https://issues.apache.org/jira/browse/HBASE-3196 Project: HBase Issue Type: Bug Reporter: Prakash Khemani -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HBASE-3189) Stagger Major Compactions
[ https://issues.apache.org/jira/browse/HBASE-3189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12928122#action_12928122 ] Kannan Muthukkaruppan commented on HBASE-3189: -- I think, usability wise, jitter (and its default) should be specified as a fraction (% value) of the major compaction cycle time, instead of in absolute terms (like 4 hours). Otherwise, you have a backward compat issue with this change for someone who is running a major compaction say every three hours, but has forgotten to set the jitter parameter when they upgrade to 0.90. And they'll be compacting anywhere from 3hrs +/- (2 * the 4-hour jitter default). This approach will also ensure you don't return negative values for "get next compaction time". > Stagger Major Compactions > - > > Key: HBASE-3189 > URL: https://issues.apache.org/jira/browse/HBASE-3189 > Project: HBase > Issue Type: Bug > Components: regionserver >Reporter: Nicolas Spiegelberg >Assignee: Nicolas Spiegelberg >Priority: Minor > Fix For: 0.90.0 > > Attachments: HBASE-3189.patch > > > For pre-split regions, we can get into a case where the oldest HFile in a > Store is pretty large and will not encounter a compaction within the 24hr > major compact window. If that's the case, we don't want multiple multi-GB > major compactions being triggered at the same time. Add the ability to stagger > the major compaction expiration window. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
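Kannan's point — express the jitter as a fraction of the compaction interval rather than as an absolute duration — can be sketched like this (an illustration only, not the HBASE-3189 patch; the method and parameter names are made up):

```java
import java.util.Random;

public class JitteredCompaction {
    /**
     * Returns a delay in [interval * (1 - f), interval * (1 + f)].
     * Because the jitter scales with the interval, a fraction f < 1 can
     * never produce a negative "next compaction" time, whatever interval
     * the operator has configured.
     */
    static long nextCompactionDelay(long intervalMs, double jitterFraction, Random rng) {
        double offset = (2 * rng.nextDouble() - 1) * jitterFraction * intervalMs;
        return Math.round(intervalMs + offset);
    }
}
```

With an absolute default (4 hours of jitter on a 3-hour cycle), the same computation could go negative, which is exactly the backward-compat hazard described above.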
[jira] Commented: (HBASE-3168) Sanity date and time check when a region server joins the cluster
[ https://issues.apache.org/jira/browse/HBASE-3168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12928117#action_12928117 ] stack commented on HBASE-3168: -- @Jeff #1 could be a legitimate problem in the case where a regionserver came up but there was no master to connect to, so the regionserver just hung out twiddling its thumbs for five or ten minutes. #2 is not an issue. You say "If each region server then calls reportsForDuty...". That's not what happens. A regionserver, when it comes up, calls reportForDuty/regionServerStartup. Thereafter, it heartbeats by calling regionServerReport (until it dies). When a master joins an already running cluster, the regionservers will just call the new master's regionServerReport - not the initializing regionServerStartup -- and the master just registers the regionserver at that time (TODO: do away with regionServerStartup, or when a new master joins the cluster, have the regionserver call regionServerStartup rather than regionServerReport. In the interests of simplicity, it seems regionServerStartup is no longer necessary, so we should just axe it). I like Jon's suggestion of changing the signature on reportForDuty to add a regionServerCurrentTimeMillis param. You might argue that regionServerReport should be modified to also take the regionserver timestamp, but that's probably overdoing it. Thanks for working on this. > Sanity date and time check when a region server joins the cluster > - > > Key: HBASE-3168 > URL: https://issues.apache.org/jira/browse/HBASE-3168 > Project: HBase > Issue Type: Improvement > Components: regionserver >Affects Versions: 0.89.20100924 > Environment: RHEL 5.5 64bit, 1 Master 4 Region Servers >Reporter: Jeff Whiting > Fix For: 0.90.0 > > Attachments: HBASE-3168-trunk-v1.txt > > > Introduce a sanity check when a RS joins the cluster to make sure its clock > isn't too far out of skew with the rest of the cluster. If the RS's time is > too far out of skew then the master would prevent it from joining and the RS > would die and log the error. > Having a RS with even small differences in time can cause huge problems due > to how hbase stores values with timestamps. > According to J-D in ServerManager we are already doing: > {code} > HServerInfo info = new HServerInfo(serverInfo); > checkIsDead(info.getServerName(), "STARTUP"); > checkAlreadySameHostPort(info); > recordNewServer(info, false, null); > {code} > And that the new check would fit in nicely there. > JG suggests we add a "ClockOutOfSync-like exception" -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
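Passing the regionserver's current time in reportForDuty would let the master run a check of roughly this shape (a sketch based on the discussion above; ClockOutOfSyncException, the parameter names, and the threshold are assumptions, not the committed code):

```java
public class ClockSkewCheck {
    // "ClockOutOfSync-like exception" as JG suggests; name is an assumption.
    static class ClockOutOfSyncException extends Exception {
        ClockOutOfSyncException(String msg) { super(msg); }
    }

    /** Rejects a regionserver whose reported clock is too far from the master's. */
    static void checkClockSkew(long serverTimeMillis, long masterTimeMillis, long maxSkewMs)
            throws ClockOutOfSyncException {
        long skew = Math.abs(masterTimeMillis - serverTimeMillis);
        if (skew > maxSkewMs) {
            throw new ClockOutOfSyncException("Clock skew of " + skew
                + "ms exceeds the allowed " + maxSkewMs + "ms");
        }
    }
}
```

The master would call this alongside checkIsDead and checkAlreadySameHostPort in ServerManager, letting the RS die with a clear error instead of silently serving skewed timestamps.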
[jira] Resolved: (HBASE-3193) Regression: HBASE_MANAGES_ZK=false broken
[ https://issues.apache.org/jira/browse/HBASE-3193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack resolved HBASE-3193. -- Resolution: Invalid I just tried this. I set the flag to false in hbase-env.sh. I started hbase. It failed to start because no zk. I then shut it all down. I then started a zk instance and then started the cluster again. This time it launched. Seems like this is not an issue. Closing for now as invalid till I get more info (Charles)? > Regression: HBASE_MANAGES_ZK=false broken > - > > Key: HBASE-3193 > URL: https://issues.apache.org/jira/browse/HBASE-3193 > Project: HBase > Issue Type: Bug >Reporter: stack > Fix For: 0.90.0 > > > From Charles Thayer up on the list: > {code} > I haven't seen any replies, which is probably because the master seems to > be changing rapidly at the moment. However, if anyone needs this for > hbase 0.89.20100726, here's a patch to work around the issue temporarily > until 0.90.0 (which will probably fix the problem). > /charles thayer > --- src/main/java/org/apache/hadoop/hbase/master/HMaster.java 2010-07-30 > 21:09:11.0 + > +++ src/main/java/org/apache/hadoop/hbase/master/HMaster.java 2010-10-11 > 20:51:30.821519000 + > @@ -1297,11 +1297,18 @@ > runtime.getVmVendor() + ", vmVersion=" + > runtime.getVmVersion()); > LOG.info("vmInputArguments=" + runtime.getInputArguments()); > } > + > + boolean hbase_manages_zk = true; > + if (System.getenv("HBASE_MANAGES_ZK") != null > + && System.getenv("HBASE_MANAGES_ZK").equals("false")) > + hbase_manages_zk = false; > + > // If 'local', defer to LocalHBaseCluster instance. Starts master > // and regionserver both in the one JVM. 
> if (LocalHBaseCluster.isLocal(conf)) { > final MiniZooKeeperCluster zooKeeperCluster = > new MiniZooKeeperCluster(); > + if (hbase_manages_zk) { // thayer > File zkDataPath = new > File(conf.get("hbase.zookeeper.property.dataDir")); > int zkClientPort = > conf.getInt("hbase.zookeeper.property.clientPort", 0); > if (zkClientPort == 0) { > @@ -1319,11 +1326,15 @@ > } > conf.set("hbase.zookeeper.property.clientPort", > Integer.toString(clientPort)); > + } // thayer > + > // Need to have the zk cluster shutdown when master is shutdown. > // Run a subclass that does the zk cluster shutdown on its way > out. > LocalHBaseCluster cluster = new LocalHBaseCluster(conf, 1, > LocalHMaster.class, HRegionServer.class); > + if (hbase_manages_zk) { > > ((LocalHMaster)cluster.getMaster()).setZKCluster(zooKeeperCluster); > + } > cluster.startup(); > } else { > HMaster master = constructMaster(masterClass, conf); > {code} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
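Pulled out on its own, the heart of Charles's workaround is the env-var check (a sketch; note the patch itself uses a case-sensitive .equals("false"), while this version is deliberately case-insensitive):

```java
public class ZkManagedFlag {
    /**
     * Mirrors the workaround's check: HBase manages ZooKeeper unless the
     * HBASE_MANAGES_ZK environment variable is explicitly set to "false".
     */
    static boolean hbaseManagesZk(String envValue) {
        return envValue == null || !envValue.equalsIgnoreCase("false");
    }
}
```

Taking the value as a parameter (rather than calling System.getenv directly) keeps the logic testable; the caller would pass System.getenv("HBASE_MANAGES_ZK").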
[jira] Created: (HBASE-3195) Fix the new TestTransform breakage up on hudson
Fix the new TestTransform breakage up on hudson --- Key: HBASE-3195 URL: https://issues.apache.org/jira/browse/HBASE-3195 Project: HBase Issue Type: Bug Reporter: stack This new test has been failing up on hudson since it was introduced at #1606. I took a look. It looks reasonable but it's failing in an odd way -- can't find blocks in hdfs. I'm moving it aside for now till the test gets some loving. Breakage lasted till at least #1613. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HBASE-3194) HBase should run on both secure and vanilla versions of Hadoop 0.20
[ https://issues.apache.org/jira/browse/HBASE-3194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12928074#action_12928074 ] Gary Helmling commented on HBASE-3194: -- Using reflection for isolation should work fine and should allow running against both versions without rebuilding. I'm working it out now. The easy part is getting the current UGI. The harder part is "setting" the current UGI (only needed by MiniHBaseCluster and test code at the moment), since secure Hadoop changed this to UGI.doAs() with a PrivilegedAction instance wrapping the actual execution. I'll sort out an initial attempt at isolating that and we can discuss the general approach. > HBase should run on both secure and vanilla versions of Hadoop 0.20 > --- > > Key: HBASE-3194 > URL: https://issues.apache.org/jira/browse/HBASE-3194 > Project: HBase > Issue Type: Bug >Reporter: Gary Helmling > > There have been a couple cases recently of folks trying to run HBase trunk > (or 0.89 DRs) on CDH3b3 or secure Hadoop.While HBase security is in the > works, it currently only runs on secure Hadoop versions. Meanwhile HBase > trunk won't compile on secure Hadoop due to backward incompatible changes in > org.apache.hadoop.security.UserGroupInformation. > This issue is to work out the minimal set of changes necessary to allow HBase > to build and run on both secure and non-secure versions of Hadoop. Though, > with secure Hadoop, I don't even think it's important to target running with > HDFS security enabled (and krb authentication). Just allow HBase to build > and run in both versions. > I think mainly this amounts to abstracting usage of UserGroupInformation and > UnixUserGroupInformation. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
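The reflection-based isolation Gary describes amounts to probing, at runtime, which UserGroupInformation API is on the classpath instead of linking against either one. A minimal sketch of such a probe (the FakeSecureUgi stand-in is invented for illustration; secure Hadoop's actual entry point is the static UserGroupInformation.getCurrentUser()):

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

public class UgiShim {
    /**
     * Returns true when the given UGI class exposes the secure-Hadoop-style
     * static getCurrentUser() method. A wrapper can branch on this to call
     * either API without a compile-time dependency on both.
     */
    static boolean hasStaticGetCurrentUser(Class<?> ugiClass) {
        try {
            Method m = ugiClass.getMethod("getCurrentUser");
            return Modifier.isStatic(m.getModifiers());
        } catch (NoSuchMethodException e) {
            return false;  // pre-security UGI: fall back to the old API
        }
    }

    // Stand-in class with the secure-Hadoop-style entry point, for illustration only.
    public static class FakeSecureUgi {
        public static FakeSecureUgi getCurrentUser() { return new FakeSecureUgi(); }
    }
}
```

The harder half that Gary mentions — wrapping UGI.doAs() with a PrivilegedAction — would be invoked the same way, via Method.invoke on the reflected handle.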
[jira] Commented: (HBASE-3194) HBase should run on both secure and vanilla versions of Hadoop 0.20
[ https://issues.apache.org/jira/browse/HBASE-3194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12928072#action_12928072 ] Andrew Purtell commented on HBASE-3194: --- It should be possible to wrap UGI and UUGI with something that uses reflection to determine what platform variant is below. Anyone foresee a problem with that approach? > HBase should run on both secure and vanilla versions of Hadoop 0.20 > --- > > Key: HBASE-3194 > URL: https://issues.apache.org/jira/browse/HBASE-3194 > Project: HBase > Issue Type: Bug >Reporter: Gary Helmling > > There have been a couple cases recently of folks trying to run HBase trunk > (or 0.89 DRs) on CDH3b3 or secure Hadoop. While HBase security is in the > works, it currently only runs on secure Hadoop versions. Meanwhile HBase > trunk won't compile on secure Hadoop due to backward incompatible changes in > org.apache.hadoop.security.UserGroupInformation. > This issue is to work out the minimal set of changes necessary to allow HBase > to build and run on both secure and non-secure versions of Hadoop. Though, > with secure Hadoop, I don't even think it's important to target running with > HDFS security enabled (and krb authentication). Just allow HBase to build > and run in both versions. > I think mainly this amounts to abstracting usage of UserGroupInformation and > UnixUserGroupInformation. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HBASE-2819) hbck should have the ability to repair basic problems
[ https://issues.apache.org/jira/browse/HBASE-2819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-2819: - Attachment: 2819-v12.txt > hbck should have the ability to repair basic problems > - > > Key: HBASE-2819 > URL: https://issues.apache.org/jira/browse/HBASE-2819 > Project: HBase > Issue Type: New Feature > Components: scripts >Reporter: Todd Lipcon >Assignee: stack >Priority: Critical > Fix For: 0.90.0 > > Attachments: 2819-v10.txt, 2819-v11.txt, 2819-v12.txt, > HBASE-2819.patch > > > Right now, the hbck utility can detect issues with region deployment but > can't fix them. > It should be able to handle basic things like closing one side of a double > assignment, re-adding something to META, etc. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Resolved: (HBASE-2471) Splitting logs, we'll make an output file though the region no longer exists
[ https://issues.apache.org/jira/browse/HBASE-2471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jean-Daniel Cryans resolved HBASE-2471. --- Resolution: Fixed Hadoop Flags: [Reviewed] Committed patch to trunk. I expect some destabilization, as some tests looked flaky when I ran the full suite, so I might have to fix more tests in the near future. > Splitting logs, we'll make an output file though the region no longer exists > > > Key: HBASE-2471 > URL: https://issues.apache.org/jira/browse/HBASE-2471 > Project: HBase > Issue Type: Bug >Reporter: stack >Assignee: Jean-Daniel Cryans > Fix For: 0.90.0 > > Attachments: HBASE-2471-v2.patch > > > The "human unit tester" (Kannan) last night wondered what happens when splitting > logs and we come across an edit whose region has since been removed. Taking > a look, it looks like we'll create the output file and write the edits for > the no-longer-extant region anyway. This will leave litter in the > filesystem -- region split files that will never be used nor removed. This > issue is about verifying that this is indeed what's happening (we do > SequenceFile.createWriter with the overwrite flag set to true, which tracing > suggests creates all intermediary directories -- to be verified) and, if > it is indeed happening, fixing split so that unless the region dir exists, we don't > write out edits... just drop them. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HBASE-2471) Splitting logs, we'll make an output file though the region no longer exists
[ https://issues.apache.org/jira/browse/HBASE-2471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jean-Daniel Cryans updated HBASE-2471: -- Attachment: HBASE-2471-v2.patch Patch that I'm about to commit. It's different from what I posted on RB because some other unit tests needed to be changed, and I only figured that out when running the full test suite. > Splitting logs, we'll make an output file though the region no longer exists > > > Key: HBASE-2471 > URL: https://issues.apache.org/jira/browse/HBASE-2471 > Project: HBase > Issue Type: Bug >Reporter: stack >Assignee: Jean-Daniel Cryans > Fix For: 0.90.0 > > Attachments: HBASE-2471-v2.patch > > > The "human unit tester" (Kannan) last night wondered what happens when splitting > logs and we come across an edit whose region has since been removed. Taking > a look, it looks like we'll create the output file and write the edits for > the no-longer-extant region anyway. This will leave litter in the > filesystem -- region split files that will never be used nor removed. This > issue is about verifying that this is indeed what's happening (we do > SequenceFile.createWriter with the overwrite flag set to true, which tracing > suggests creates all intermediary directories -- to be verified) and, if > it is indeed happening, fixing split so that unless the region dir exists, we don't > write out edits... just drop them. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
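The core of the fix described in this issue is a guard before creating a writer for recovered edits. A minimal sketch of that guard, with plain java.io.File standing in for the Hadoop FileSystem API (the method name and parameters are invented for illustration):

```java
import java.io.File;

public class SplitFilter {
    /**
     * Only write recovered edits for a region whose directory still exists
     * under the table dir; edits for removed regions are dropped instead of
     * leaving never-used split files in the filesystem.
     */
    static boolean shouldWriteEdits(File tableDir, String encodedRegionName) {
        return new File(tableDir, encodedRegionName).isDirectory();
    }
}
```

Checking before creating the writer matters because, as the description notes, SequenceFile.createWriter with overwrite=true would otherwise recreate the intermediary directories and resurrect the litter.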
[jira] Commented: (HBASE-3194) HBase should run on both secure and vanilla versions of Hadoop 0.20
[ https://issues.apache.org/jira/browse/HBASE-3194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12928062#action_12928062 ] ryan rawson commented on HBASE-3194: it would also be nice to run on both w/o rebuilding. > HBase should run on both secure and vanilla versions of Hadoop 0.20 > --- > > Key: HBASE-3194 > URL: https://issues.apache.org/jira/browse/HBASE-3194 > Project: HBase > Issue Type: Bug >Reporter: Gary Helmling > > There have been a couple cases recently of folks trying to run HBase trunk > (or 0.89 DRs) on CDH3b3 or secure Hadoop.While HBase security is in the > works, it currently only runs on secure Hadoop versions. Meanwhile HBase > trunk won't compile on secure Hadoop due to backward incompatible changes in > org.apache.hadoop.security.UserGroupInformation. > This issue is to work out the minimal set of changes necessary to allow HBase > to build and run on both secure and non-secure versions of Hadoop. Though, > with secure Hadoop, I don't even think it's important to target running with > HDFS security enabled (and krb authentication). Just allow HBase to build > and run in both versions. > I think mainly this amounts to abstracting usage of UserGroupInformation and > UnixUserGroupInformation. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (HBASE-3194) HBase should run on both secure and vanilla versions of Hadoop 0.20
HBase should run on both secure and vanilla versions of Hadoop 0.20 --- Key: HBASE-3194 URL: https://issues.apache.org/jira/browse/HBASE-3194 Project: HBase Issue Type: Bug Reporter: Gary Helmling There have been a couple cases recently of folks trying to run HBase trunk (or 0.89 DRs) on CDH3b3 or secure Hadoop. While HBase security is in the works, it currently only runs on secure Hadoop versions. Meanwhile HBase trunk won't compile on secure Hadoop due to backward incompatible changes in org.apache.hadoop.security.UserGroupInformation. This issue is to work out the minimal set of changes necessary to allow HBase to build and run on both secure and non-secure versions of Hadoop. Though, with secure Hadoop, I don't even think it's important to target running with HDFS security enabled (and krb authentication). Just allow HBase to build and run in both versions. I think mainly this amounts to abstracting usage of UserGroupInformation and UnixUserGroupInformation. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
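One common way to "abstract usage of UserGroupInformation" without maintaining two builds is runtime capability detection: probe the classpath for classes that only one Hadoop lineage ships, rather than linking against them at compile time. A minimal sketch of the idea — the helper below is hypothetical, not from any HBase patch; the Hadoop class names are the ones named in the issue:

```java
public class UgiCompat {
    // Returns true if the named class is loadable on the current classpath.
    // Secure Hadoop dropped UnixUserGroupInformation, so its presence or
    // absence distinguishes the vanilla 0.20 lineage from the secure one
    // without a compile-time dependency on either.
    static boolean classAvailable(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        boolean vanilla = classAvailable("org.apache.hadoop.security.UnixUserGroupInformation");
        System.out.println(vanilla
            ? "vanilla 0.20 UGI detected"
            : "class not found: secure Hadoop, or no Hadoop on classpath");
    }
}
```

The same probe-then-reflect pattern extends to invoking whichever UGI method exists, which is roughly what "run on both w/o rebuilding" requires.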
[jira] Updated: (HBASE-3192) Test that HBase runs when a .META. row without an HRI
[ https://issues.apache.org/jira/browse/HBASE-3192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-3192: - Attachment: 3192.txt Add in this too when I make test... this makes HBaseAdmin immune to odd .META. rows. > Test that HBase runs when a .META. row without an HRI > - > > Key: HBASE-3192 > URL: https://issues.apache.org/jira/browse/HBASE-3192 > Project: HBase > Issue Type: Bug >Reporter: stack >Assignee: stack > Fix For: 0.90.0 > > Attachments: 3192.txt > > > A .META. without an HRI entry should never happen but if it does, it should > not cause master shutdown (master is on a hair-trigger at mo. so that issues > are noticed quickly). HBASE-3151 fixed being able to deal w/ empty HRI. > This issue is about adding a test to verify hbase stays up (make sure chore > runs and that test does meta scanning with MetaScanner and MetaReader). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HBASE-2828) HTable unnecessarily coupled with HMaster
[ https://issues.apache.org/jira/browse/HBASE-2828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12928051#action_12928051 ] stack commented on HBASE-2828: -- done > HTable unnecessarily coupled with HMaster > - > > Key: HBASE-2828 > URL: https://issues.apache.org/jira/browse/HBASE-2828 > Project: HBase > Issue Type: Bug > Components: client >Affects Versions: 0.90.0 >Reporter: Nicolas Spiegelberg >Assignee: Nicolas Spiegelberg > Fix For: 0.90.0 > > Attachments: HBASE-2828-0.90.patch, HBASE-2828.a.patch, > HBASE-2828.patch > > > HTable constructor calls "getCurrentNrHRS()" to get the region server count > for thread pool creation. This code calls HBaseAdmin.getClusterStatus() > [aka: the HMaster] to get the server count. This information can be scraped > from counting the ZooKeeper /hbase/rs/--- ZNodes. Need to remove unnecessary > master queries when ZooKeeper can do the same job. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (HBASE-3193) Regression: HBASE_MANAGES_ZK=false broken
Regression: HBASE_MANAGES_ZK=false broken - Key: HBASE-3193 URL: https://issues.apache.org/jira/browse/HBASE-3193 Project: HBase Issue Type: Bug Reporter: stack Fix For: 0.90.0 From Charles Thayer up on the list:
{code}
I haven't seen any replies, which is probably because the master seems to be changing rapidly at the moment. However, if anyone needs this for hbase 0.89.20100726, here's a patch to work around the issue temporarily until 0.90.0 (which will probably fix the problem). /charles thayer

--- src/main/java/org/apache/hadoop/hbase/master/HMaster.java 2010-07-30 21:09:11.0 +
+++ src/main/java/org/apache/hadoop/hbase/master/HMaster.java 2010-10-11 20:51:30.821519000 +
@@ -1297,11 +1297,18 @@
         runtime.getVmVendor() + ", vmVersion=" + runtime.getVmVersion());
       LOG.info("vmInputArguments=" + runtime.getInputArguments());
     }
+
+    boolean hbase_manages_zk = true;
+    if (System.getenv("HBASE_MANAGES_ZK") != null
+        && System.getenv("HBASE_MANAGES_ZK").equals("false"))
+      hbase_manages_zk = false;
+
     // If 'local', defer to LocalHBaseCluster instance. Starts master
     // and regionserver both in the one JVM.
     if (LocalHBaseCluster.isLocal(conf)) {
       final MiniZooKeeperCluster zooKeeperCluster = new MiniZooKeeperCluster();
+      if (hbase_manages_zk) { // thayer
       File zkDataPath = new File(conf.get("hbase.zookeeper.property.dataDir"));
       int zkClientPort = conf.getInt("hbase.zookeeper.property.clientPort", 0);
       if (zkClientPort == 0) {
@@ -1319,11 +1326,15 @@
       }
       conf.set("hbase.zookeeper.property.clientPort", Integer.toString(clientPort));
+      } // thayer
+
       // Need to have the zk cluster shutdown when master is shutdown.
       // Run a subclass that does the zk cluster shutdown on its way out.
       LocalHBaseCluster cluster = new LocalHBaseCluster(conf, 1, LocalHMaster.class, HRegionServer.class);
+      if (hbase_manages_zk) {
       ((LocalHMaster)cluster.getMaster()).setZKCluster(zooKeeperCluster);
+      }
       cluster.startup();
     } else {
       HMaster master = constructMaster(masterClass, conf);
{code}
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HBASE-2828) HTable unnecessarily coupled with HMaster
[ https://issues.apache.org/jira/browse/HBASE-2828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12928012#action_12928012 ] Nicolas Spiegelberg commented on HBASE-2828: can we add some explicit comment in there about purposefully not going to Master for the RS count. I wouldn't want a 3rd occurrence of this... > HTable unnecessarily coupled with HMaster > - > > Key: HBASE-2828 > URL: https://issues.apache.org/jira/browse/HBASE-2828 > Project: HBase > Issue Type: Bug > Components: client >Affects Versions: 0.90.0 >Reporter: Nicolas Spiegelberg >Assignee: Nicolas Spiegelberg > Fix For: 0.90.0 > > Attachments: HBASE-2828-0.90.patch, HBASE-2828.a.patch, > HBASE-2828.patch > > > HTable constructor calls "getCurrentNrHRS()" to get the region server count > for thread pool creation. This code calls HBaseAdmin.getClusterStatus() > [aka: the HMaster] to get the server count. This information can be scraped > from counting the ZooKeeper /hbase/rs/--- ZNodes. Need to remove unnecessary > master queries when ZooKeeper can do the same job. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
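The ZooKeeper-based counting the description proposes can be separated from the ZK call itself, which also makes the "why we don't ask the master" comment Nicolas requests easy to keep in one place. A sketch, assuming the child list would come from `ZooKeeper.getChildren("/hbase/rs", false)` in real code:

```java
import java.util.Arrays;
import java.util.List;

public class RegionServerCount {
    // In the real client this list would come from
    // zooKeeper.getChildren("/hbase/rs", false) -- one ephemeral znode per
    // live regionserver -- deliberately NOT from
    // HBaseAdmin.getClusterStatus(), which is a round trip to the HMaster.
    static int currentNrHRS(List<String> rsZnodeChildren) {
        return rsZnodeChildren == null ? 0 : rsZnodeChildren.size();
    }

    public static void main(String[] args) {
        List<String> children = Arrays.asList(
            "host1,60020,1288733355197", "host2,60020,1288733355210");
        // Size the HTable thread pool from the znode count; no master RPC.
        System.out.println("regionservers=" + currentNrHRS(children)); // prints regionservers=2
    }
}
```

The znode names above follow the `host,port,startcode` shape seen elsewhere in this digest but are made up for illustration.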
[jira] Created: (HBASE-3192) Test that HBase runs when a .META. row without an HRI
Test that HBase runs when a .META. row without an HRI - Key: HBASE-3192 URL: https://issues.apache.org/jira/browse/HBASE-3192 Project: HBase Issue Type: Bug Reporter: stack Assignee: stack Fix For: 0.90.0 A .META. without an HRI entry should never happen but if it does, it should not cause master shutdown (master is on a hair-trigger at mo. so that issues are noticed quickly). HBASE-3151 fixed being able to deal w/ empty HRI. This issue is about adding a test to verify hbase stays up (make sure chore runs and that test does meta scanning with MetaScanner and MetaReader). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HBASE-3151) NPE when trying to read regioninfo from .META.
[ https://issues.apache.org/jira/browse/HBASE-3151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12928010#action_12928010 ] stack commented on HBASE-3151: -- Well, I think I can add test for case of a .META. row that has all but HRI. That'd be good for testing we don't crap out as we were doing. Let me make a new issue to do that. > NPE when trying to read regioninfo from .META. > -- > > Key: HBASE-3151 > URL: https://issues.apache.org/jira/browse/HBASE-3151 > Project: HBase > Issue Type: Bug > Affects Versions: 0.90.0 > Reporter: stack > Assignee: stack > Fix For: 0.90.0 > > Attachments: offline.txt > > > This is an old issue perhaps in a new guise. From the list, Sebastien Bauer reports:
> {code}
> 2010-10-25 08:13:01,690 ERROR org.apache.hadoop.hbase.master.CatalogJanitor: Caught exception
> java.lang.NullPointerException
> 2010-10-25 08:13:24,385 INFO org.apache.hadoop.hbase.master.ServerManager: regionservers=2, averageload=2538
> 2010-10-23 20:16:17,890 DEBUG org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: Cached location for .META.,,1.1028785192 is db2a.goldenline.pl:60020
> 2010-10-23 20:16:18,432 FATAL org.apache.hadoop.hbase.master.HMaster: Unhandled exception. Starting shutdown.
> java.lang.NullPointerException
>   at org.apache.hadoop.hbase.util.Writables.getWritable(Writables.java:75)
>   at org.apache.hadoop.hbase.util.Writables.getHRegionInfo(Writables.java:119)
>   at org.apache.hadoop.hbase.client.MetaScanner$1.processRow(MetaScanner.java:188)
>   at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:157)
>   at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:69)
>   at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:54)
>   at org.apache.hadoop.hbase.client.MetaScanner.listAllRegions(MetaScanner.java:195)
>   at org.apache.hadoop.hbase.master.AssignmentManager.assignAllUserRegions(AssignmentManager.java:1048)
>   at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:379)
>   at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:265)
> 2010-10-23 20:16:18,433 INFO org.apache.hadoop.hbase.master.HMaster: Aborting
> 2010-10-23 20:16:18,433 DEBUG org.apache.hadoop.hbase.master.HMaster: Stopping service threads
> {code}
> I think he has an old master... checking.
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HBASE-2770) Major compactions from shell may not major compact all families
[ https://issues.apache.org/jira/browse/HBASE-2770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12928009#action_12928009 ] stack commented on HBASE-2770: -- Will I close this Dave? > Major compactions from shell may not major compact all families > --- > > Key: HBASE-2770 > URL: https://issues.apache.org/jira/browse/HBASE-2770 > Project: HBase > Issue Type: Bug >Affects Versions: 0.20.4 >Reporter: Dave Latham >Priority: Critical > Fix For: 0.92.0 > > > As part of a data center migration, I initiated a major_compaction request on > all tables from the shell. A few hours later, all the region servers in the > cluster appeared to have completed the compactions and all compactionQueue > metrics were back to 0. However, some column families of some regions had > not actually done a major compaction. > Digging through logs and code, it looks like the following happened. The > shell makes a major compaction request which sets > HRegion.forceMajorCompaction to true for every region. Periodically, the > HRegionServer.MajorCompactionChecker checks to see if a major compaction is > needed in any family's store. If so, calls > CompactSplitThread.compactionRequested which ends up setting the region > forceMajorCompaction to false, even if it is already in the compaction queue > and set to true. Then, when that region comes off the queue to be compacted, > each family/store separately checks for whether it should do a major > compaction, so some families may not do so. > (This is not good if, for example, you're doing a DistCp of the hbase dir and > later on the cluster decides to do a compaction on those files and deletes > ones the DistCp job is looking for, causing it to fail.) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
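The sequence Dave describes boils down to a force flag that is cleared when a compaction is *requested* rather than consumed per-family when the compaction actually runs. A toy reproduction of how the shell's flag gets lost (hypothetical names, not the HBase classes):

```java
public class ForceFlagRace {
    static class Region {
        boolean forceMajor = true; // set for every region by the shell's major_compact
    }

    // Buggy behavior from the report: any compaction request (e.g. one
    // triggered by MajorCompactionChecker) resets forceMajorCompaction to
    // false, even though the region is already queued with the flag set.
    static boolean buggyRequest(Region r) {
        boolean force = r.forceMajor;
        r.forceMajor = false; // reset happens on request, not on execution
        return force;
    }

    public static void main(String[] args) {
        Region r = new Region();
        buggyRequest(r);                     // checker-triggered request consumes the flag...
        boolean majorRan = buggyRequest(r);  // ...so the queued shell request sees false
        System.out.println("major compaction ran: " + majorRan); // prints major compaction ran: false
    }
}
```

A fix along these lines would snapshot the flag per family/store at execution time instead of clearing it on enqueue; this sketch only illustrates the race, not the remedy.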
[jira] Updated: (HBASE-2828) HTable unnecessarily coupled with HMaster
[ https://issues.apache.org/jira/browse/HBASE-2828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-2828: - Resolution: Fixed Status: Resolved (was: Patch Available) I committed your patch Jon. > HTable unnecessarily coupled with HMaster > - > > Key: HBASE-2828 > URL: https://issues.apache.org/jira/browse/HBASE-2828 > Project: HBase > Issue Type: Bug > Components: client >Affects Versions: 0.90.0 >Reporter: Nicolas Spiegelberg >Assignee: Nicolas Spiegelberg > Fix For: 0.90.0 > > Attachments: HBASE-2828-0.90.patch, HBASE-2828.a.patch, > HBASE-2828.patch > > > HTable constructor calls "getCurrentNrHRS()" to get the region server count > for thread pool creation. This code calls HBaseAdmin.getClusterStatus() > [aka: the HMaster] to get the server count. This information can be scraped > from counting the ZooKeeper /hbase/rs/--- ZNodes. Need to remove unnecessary > master queries when ZooKeeper can do the same job. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Resolved: (HBASE-3191) FilterList with MUST_PASS_ONE and SCVF isn't working
[ https://issues.apache.org/jira/browse/HBASE-3191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack resolved HBASE-3191. -- Resolution: Fixed Fix Version/s: 0.90.0 Hadoop Flags: [Reviewed] Committed to TRUNK. Thank you for the patch Stefan. > FilterList with MUST_PASS_ONE and SCVF isn't working > > > Key: HBASE-3191 > URL: https://issues.apache.org/jira/browse/HBASE-3191 > Project: HBase > Issue Type: Bug > Components: filters >Affects Versions: 0.89.20100924, 0.90.0 >Reporter: Stefan Seelmann >Priority: Minor > Fix For: 0.90.0 > > Attachments: HBASE-3191.patch > > > In a special case the FilterList with MUST_PASS_ONE operator doesn't work > correctly: > - a filter in the list is a SingleColumValueFilter with filterIfMissing=true > - FilterList.filterKeyValue(KeyValue) is called > - SingleColumValueFilter.filterKeyValue(KeyValue) is called > - SingleColumValueFilter.filterKeyValue(KeyValue) returns ReturnCode.INCLUDE > if the KeyValue doesn't match a column (to support filterIfMissing) > - FilterList.filterKeyValue(KeyValue) immediately returns ReturnCode.INCLUDE, > remaining filters in the list aren't evaluated. > However it is required to evaluate remaining filters, otherwise filterRow() > filters out rows in case the filter's filterKeyValue() saves state that is > used by filterRow(). (SingleColumValueFilter, SkipFilter, WhileMatchFilter do > so) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HBASE-3191) FilterList with MUST_PASS_ONE and SCVF isn't working
[ https://issues.apache.org/jira/browse/HBASE-3191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Seelmann updated HBASE-3191: --- Attachment: HBASE-3191.patch Patch with Test > FilterList with MUST_PASS_ONE and SCVF isn't working > > > Key: HBASE-3191 > URL: https://issues.apache.org/jira/browse/HBASE-3191 > Project: HBase > Issue Type: Bug > Components: filters >Affects Versions: 0.89.20100924, 0.90.0 >Reporter: Stefan Seelmann >Priority: Minor > Attachments: HBASE-3191.patch > > > In a special case the FilterList with MUST_PASS_ONE operator doesn't work > correctly: > - a filter in the list is a SingleColumValueFilter with filterIfMissing=true > - FilterList.filterKeyValue(KeyValue) is called > - SingleColumValueFilter.filterKeyValue(KeyValue) is called > - SingleColumValueFilter.filterKeyValue(KeyValue) returns ReturnCode.INCLUDE > if the KeyValue doesn't match a column (to support filterIfMissing) > - FilterList.filterKeyValue(KeyValue) immediately returns ReturnCode.INCLUDE, > remaining filters in the list aren't evaluated. > However it is required to evaluate remaining filters, otherwise filterRow() > filters out rows in case the filter's filterKeyValue() saves state that is > used by filterRow(). (SingleColumValueFilter, SkipFilter, WhileMatchFilter do > so) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (HBASE-3191) FilterList with MUST_PASS_ONE and SCVF isn't working
FilterList with MUST_PASS_ONE and SCVF isn't working Key: HBASE-3191 URL: https://issues.apache.org/jira/browse/HBASE-3191 Project: HBase Issue Type: Bug Components: filters Affects Versions: 0.89.20100924, 0.90.0 Reporter: Stefan Seelmann Priority: Minor In a special case the FilterList with MUST_PASS_ONE operator doesn't work correctly: - a filter in the list is a SingleColumValueFilter with filterIfMissing=true - FilterList.filterKeyValue(KeyValue) is called - SingleColumValueFilter.filterKeyValue(KeyValue) is called - SingleColumValueFilter.filterKeyValue(KeyValue) returns ReturnCode.INCLUDE if the KeyValue doesn't match a column (to support filterIfMissing) - FilterList.filterKeyValue(KeyValue) immediately returns ReturnCode.INCLUDE, remaining filters in the list aren't evaluated. However it is required to evaluate remaining filters, otherwise filterRow() filters out rows in case the filter's filterKeyValue() saves state that is used by filterRow(). (SingleColumValueFilter, SkipFilter, WhileMatchFilter do so) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
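The failure Stefan describes can be reproduced in miniature: MUST_PASS_ONE must not short-circuit on the first INCLUDE, or stateful filters later in the list never see the KeyValue and the state their filterRow() relies on goes stale. A sketch with strings standing in for KeyValues — this is an illustration of the principle, not the actual FilterList code:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

public class FilterListSketch {
    enum ReturnCode { INCLUDE, SKIP }

    // Buggy MUST_PASS_ONE: returns on the first INCLUDE, so filters later in
    // the list are never evaluated for this KeyValue.
    static ReturnCode buggy(List<Function<String, ReturnCode>> filters, String kv) {
        for (Function<String, ReturnCode> f : filters)
            if (f.apply(kv) == ReturnCode.INCLUDE) return ReturnCode.INCLUDE;
        return ReturnCode.SKIP;
    }

    // Fixed: evaluate every filter so each can update its internal state,
    // then combine the results afterwards.
    static ReturnCode fixed(List<Function<String, ReturnCode>> filters, String kv) {
        ReturnCode rc = ReturnCode.SKIP;
        for (Function<String, ReturnCode> f : filters)
            if (f.apply(kv) == ReturnCode.INCLUDE) rc = ReturnCode.INCLUDE; // no early return
        return rc;
    }

    public static void main(String[] args) {
        List<String> seen = new ArrayList<>();
        Function<String, ReturnCode> always = kv -> ReturnCode.INCLUDE;
        Function<String, ReturnCode> stateful = kv -> { seen.add(kv); return ReturnCode.SKIP; };

        buggy(Arrays.asList(always, stateful), "kv1");
        System.out.println("buggy: stateful saw " + seen.size() + " keyvalues"); // prints 0
        fixed(Arrays.asList(always, stateful), "kv2");
        System.out.println("fixed: stateful saw " + seen.size() + " keyvalues"); // prints 1
    }
}
```

The `stateful` lambda plays the role of SingleColumnValueFilter/SkipFilter/WhileMatchFilter, whose filterKeyValue() saves state that filterRow() later consults.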
[jira] Updated: (HBASE-2828) HTable unnecessarily coupled with HMaster
[ https://issues.apache.org/jira/browse/HBASE-2828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Gray updated HBASE-2828: - Status: Patch Available (was: Reopened) > HTable unnecessarily coupled with HMaster > - > > Key: HBASE-2828 > URL: https://issues.apache.org/jira/browse/HBASE-2828 > Project: HBase > Issue Type: Bug > Components: client >Affects Versions: 0.90.0 >Reporter: Nicolas Spiegelberg >Assignee: Nicolas Spiegelberg > Fix For: 0.90.0 > > Attachments: HBASE-2828-0.90.patch, HBASE-2828.a.patch, > HBASE-2828.patch > > > HTable constructor calls "getCurrentNrHRS()" to get the region server count > for thread pool creation. This code calls HBaseAdmin.getClusterStatus() > [aka: the HMaster] to get the server count. This information can be scraped > from counting the ZooKeeper /hbase/rs/--- ZNodes. Need to remove unnecessary > master queries when ZooKeeper can do the same job. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HBASE-3160) Compactions: Use more intelligent priorities for PriorityCompactionQueue
[ https://issues.apache.org/jira/browse/HBASE-3160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12927969#action_12927969 ] Nicolas Spiegelberg commented on HBASE-3160: @Jeff: I forgot yesterday, but I also did fix an issue with the compaction priorities. If a flush happened for the store that was being compacted, it would be added to the compaction queue at the pre-compact priority instead of post-compact. We now do: {code} if (!this.server.isStopRequested()) { // requests that were added during compaction will have a // stale priority. remove and re-insert to update priority boolean hadCompaction = compactionQueue.remove(r); if (midKey != null) { split(r, midKey); } else if (hadCompaction) { compactionQueue.add(r); } } {code} > Compactions: Use more intelligent priorities for PriorityCompactionQueue > > > Key: HBASE-3160 > URL: https://issues.apache.org/jira/browse/HBASE-3160 > Project: HBase > Issue Type: Improvement >Affects Versions: 0.89.20100924, 0.90.0 >Reporter: Nicolas Spiegelberg >Assignee: Nicolas Spiegelberg > Fix For: 0.90.0 > > Attachments: HBASE-3160.patch > > > One of the problems with the current compaction queue is that we have a very > low granularity on the importance of the various compactions in the queue. > If a StoreFile count exceeds 15 files, only then do we bump via enum change. > We should instead look into more intelligent, granular priority metrics for > choosing the next compaction. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HBASE-2828) HTable unnecessarily coupled with HMaster
[ https://issues.apache.org/jira/browse/HBASE-2828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Gray updated HBASE-2828: - Attachment: HBASE-2828-0.90.patch Looks like the documentation changes made it, just not the HTable.getCurrentNrHRS(). Patch changes implementation of that method (same principle as original patch but slightly different). > HTable unnecessarily coupled with HMaster > - > > Key: HBASE-2828 > URL: https://issues.apache.org/jira/browse/HBASE-2828 > Project: HBase > Issue Type: Bug > Components: client >Affects Versions: 0.90.0 >Reporter: Nicolas Spiegelberg >Assignee: Nicolas Spiegelberg > Fix For: 0.90.0 > > Attachments: HBASE-2828-0.90.patch, HBASE-2828.a.patch, > HBASE-2828.patch > > > HTable constructor calls "getCurrentNrHRS()" to get the region server count > for thread pool creation. This code calls HBaseAdmin.getClusterStatus() > [aka: the HMaster] to get the server count. This information can be scraped > from counting the ZooKeeper /hbase/rs/--- ZNodes. Need to remove unnecessary > master queries when ZooKeeper can do the same job. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Reopened: (HBASE-2828) HTable unnecessarily coupled with HMaster
[ https://issues.apache.org/jira/browse/HBASE-2828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Gray reopened HBASE-2828: -- This is in some 0.89 branches but was left out from 0.90 (I guess during master rewrite commit). > HTable unnecessarily coupled with HMaster > - > > Key: HBASE-2828 > URL: https://issues.apache.org/jira/browse/HBASE-2828 > Project: HBase > Issue Type: Bug > Components: client >Affects Versions: 0.90.0 >Reporter: Nicolas Spiegelberg >Assignee: Nicolas Spiegelberg > Fix For: 0.90.0 > > Attachments: HBASE-2828.a.patch, HBASE-2828.patch > > > HTable constructor calls "getCurrentNrHRS()" to get the region server count > for thread pool creation. This code calls HBaseAdmin.getClusterStatus() > [aka: the HMaster] to get the server count. This information can be scraped > from counting the ZooKeeper /hbase/rs/--- ZNodes. Need to remove unnecessary > master queries when ZooKeeper can do the same job. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HBASE-2445) Clean up client retry policies
[ https://issues.apache.org/jira/browse/HBASE-2445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dave Latham updated HBASE-2445: --- Fix Version/s: 0.92.0 Here's hoping it can get picked up for 0.92 > Clean up client retry policies > -- > > Key: HBASE-2445 > URL: https://issues.apache.org/jira/browse/HBASE-2445 > Project: HBase > Issue Type: Improvement > Components: client >Reporter: Todd Lipcon >Priority: Critical > Fix For: 0.92.0 > > > Right now almost all retry behavior is governed by a single parameter that > determines the number of retries. In a few places, there are also conf for > the number of millis to sleep between retries. This isn't quite flexible > enough. If we can refactor some of the retry logic into a RetryPolicy class, > we could introduce exponential backoff where appropriate, clean up some of > the config, etc -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HBASE-3160) Compactions: Use more intelligent priorities for PriorityCompactionQueue
[ https://issues.apache.org/jira/browse/HBASE-3160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12927887#action_12927887 ] Dave Latham commented on HBASE-3160: Does that mean https://issues.apache.org/jira/browse/HBASE-2770 is fixed then? > Compactions: Use more intelligent priorities for PriorityCompactionQueue > > > Key: HBASE-3160 > URL: https://issues.apache.org/jira/browse/HBASE-3160 > Project: HBase > Issue Type: Improvement >Affects Versions: 0.89.20100924, 0.90.0 >Reporter: Nicolas Spiegelberg >Assignee: Nicolas Spiegelberg > Fix For: 0.90.0 > > Attachments: HBASE-3160.patch > > > One of the problems with the current compaction queue is that we have a very > low granularity on the importance of the various compactions in the queue. > If a StoreFile count exceeds 15 files, only then do we bump via enum change. > We should instead look into more intelligent, granular priority metrics for > choosing the next compaction. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HBASE-3190) Problem with disabling and droping table
[ https://issues.apache.org/jira/browse/HBASE-3190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12927778#action_12927778 ] Sebastian Bauer commented on HBASE-3190: im using revision 1030348 > Problem with disabling and droping table > > > Key: HBASE-3190 > URL: https://issues.apache.org/jira/browse/HBASE-3190 > Project: HBase > Issue Type: Bug >Reporter: Sebastian Bauer > Fix For: 0.90.0 > > > Table disabling was interrupted by kill -9 all part of hbase and now we > cannot do anything with this table, disabling doesn't show any exception: > hbase(main):019:0> disable 'NGolden_CTU' > 0 row(s) in 0.0250 seconds > but droping show this: > hbase(main):020:0> drop 'NGolden_CTU' > ERROR: org.apache.hadoop.hbase.TableNotDisabledException: > org.apache.hadoop.hbase.TableNotDisabledException: NGolden_CTU > at > org.apache.hadoop.hbase.master.HMaster.checkTableModifiable(HMaster.java:861) > at > org.apache.hadoop.hbase.master.handler.TableEventHandler.(TableEventHandler.java:52) > at > org.apache.hadoop.hbase.master.handler.DeleteTableHandler.(DeleteTableHandler.java:42) > at > org.apache.hadoop.hbase.master.HMaster.deleteTable(HMaster.java:779) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) > at java.lang.reflect.Method.invoke(Method.java:597) > at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:570) > at > org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1025) > Here is some help for this command: > Drop the named table. Table must first be disabled. If table has > more than one region, run a major compaction on .META.: > hbase> major_compact ".META." 
> after this nothing strange is in logs
> when we restart hbase we get this:
> 2010-11-03 08:56:37,892 DEBUG org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Processing open of NGolden_CTU,3065-d_2010_10_14_245FF1A15F4E236002ED3AB651BAB97E,1288046281444.0c8579e52b0ea3f2dab5b6a857ad030b.
> 2010-11-03 08:56:37,892 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x12c10b5fb780005 Attempting to transition node 0c8579e52b0ea3f2dab5b6a857ad030b from M_ZK_REGION_OFFLINE to RS_ZK_REGION_OPENING
> 2010-11-03 08:56:37,892 ERROR org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while processing event M_RS_OPEN_REGION
> java.lang.NullPointerException
>   at org.apache.hadoop.hbase.util.Writables.getWritable(Writables.java:75)
>   at org.apache.hadoop.hbase.executor.RegionTransitionData.fromBytes(RegionTransitionData.java:198)
>   at org.apache.hadoop.hbase.zookeeper.ZKAssign.transitionNode(ZKAssign.java:669)
>   at org.apache.hadoop.hbase.zookeeper.ZKAssign.transitionNodeOpening(ZKAssign.java:549)
>   at org.apache.hadoop.hbase.zookeeper.ZKAssign.transitionNodeOpening(ZKAssign.java:542)
>   at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.transitionZookeeperOfflineToOpening(OpenRegionHandler.java:208)
[jira] Created: (HBASE-3190) Problem with disabling and droping table
Problem with disabling and droping table Key: HBASE-3190 URL: https://issues.apache.org/jira/browse/HBASE-3190 Project: HBase Issue Type: Bug Reporter: Sebastian Bauer Fix For: 0.90.0 Table disabling was interrupted by kill -9 all part of hbase and now we cannot do anything with this table, disabling doesn't show any exception: hbase(main):019:0> disable 'NGolden_CTU' 0 row(s) in 0.0250 seconds but droping show this: hbase(main):020:0> drop 'NGolden_CTU' ERROR: org.apache.hadoop.hbase.TableNotDisabledException: org.apache.hadoop.hbase.TableNotDisabledException: NGolden_CTU at org.apache.hadoop.hbase.master.HMaster.checkTableModifiable(HMaster.java:861) at org.apache.hadoop.hbase.master.handler.TableEventHandler.(TableEventHandler.java:52) at org.apache.hadoop.hbase.master.handler.DeleteTableHandler.(DeleteTableHandler.java:42) at org.apache.hadoop.hbase.master.HMaster.deleteTable(HMaster.java:779) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:570) at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1025) Here is some help for this command: Drop the named table. Table must first be disabled. If table has more than one region, run a major compaction on .META.: hbase> major_compact ".META." after this nothing strange is in logs when we restart hbase we get this: 2010-11-03 08:56:37,892 DEBUG org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Processing open of NGolden_CTU,3065-d_2010_10_14_245FF1A15F4E236002ED3AB651BAB97E,1288046281444.0c8579e52b0ea3f2dab5b6a857ad030b. 
2010-11-03 08:56:37,892 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x12c10b5fb780005 Attempting to transition node 0c8579e52b0ea3f2dab5b6a857ad030b from M_ZK_REGION_OFFLINE to RS_ZK_REGION_OPENING
2010-11-03 08:56:37,892 ERROR org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while processing event M_RS_OPEN_REGION
java.lang.NullPointerException
  at org.apache.hadoop.hbase.util.Writables.getWritable(Writables.java:75)
  at org.apache.hadoop.hbase.executor.RegionTransitionData.fromBytes(RegionTransitionData.java:198)
  at org.apache.hadoop.hbase.zookeeper.ZKAssign.transitionNode(ZKAssign.java:669)
  at org.apache.hadoop.hbase.zookeeper.ZKAssign.transitionNodeOpening(ZKAssign.java:549)
  at org.apache.hadoop.hbase.zookeeper.ZKAssign.transitionNodeOpening(ZKAssign.java:542)
  at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.transitionZookeeperOfflineToOpening(OpenRegionHandler.java:208)
  at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:89)
  at org.apa
[jira] Commented: (HBASE-1956) Export HDFS read and write latency as a metric
[ https://issues.apache.org/jira/browse/HBASE-1956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12927759#action_12927759 ] stack commented on HBASE-1956: -- Thanks for looking into this Gary. > Export HDFS read and write latency as a metric > -- > > Key: HBASE-1956 > URL: https://issues.apache.org/jira/browse/HBASE-1956 > Project: HBase > Issue Type: Improvement >Reporter: Andrew Purtell >Assignee: Andrew Purtell >Priority: Minor > Fix For: 0.90.0 > > Attachments: HBASE-1956.patch, HBASE-1956.patch > > > HDFS write latency spikes especially are an indicator of general cluster > overloading. We see this where the WAL writer complains about writes taking > > 1 second, sometimes > 4, etc. If for example the average write latency over > the monitoring period is exported as a metric, then this can feed into > alerting for or automatic provisioning of additional cluster hardware. While > we're at it, export read side metrics as well. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
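The metric this issue asks for is essentially a per-period average: accumulate latency around each HDFS read or write (e.g. a WAL sync) and report the mean once per monitoring interval, resetting for the next one. A minimal sketch of that accumulator — a hypothetical class, not the code in HBASE-1956.patch:

```java
public class LatencyMetric {
    private long totalNanos = 0;
    private long ops = 0;

    // Record one HDFS operation's latency, measured by the caller
    // (System.nanoTime() before/after the write or read).
    synchronized void inc(long nanos) {
        totalNanos += nanos;
        ops++;
    }

    // Called once per monitoring period by the metrics framework:
    // returns the average latency in milliseconds and resets the window,
    // so spikes in one period can feed alerting without being diluted.
    synchronized double pushMetricMs() {
        double avg = ops == 0 ? 0.0 : (totalNanos / 1.0e6) / ops;
        totalNanos = 0;
        ops = 0;
        return avg;
    }

    public static void main(String[] args) {
        LatencyMetric fsWriteLatency = new LatencyMetric();
        fsWriteLatency.inc(2_000_000); // a 2 ms write
        fsWriteLatency.inc(4_000_000); // a 4 ms write
        System.out.println("avg write latency ms: " + fsWriteLatency.pushMetricMs()); // prints 3.0
    }
}
```

An average alone hides the > 1 s outliers the WAL writer complains about, so a real exporter would likely also track a max or percentile per period; the sketch keeps only the mean for brevity.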