[jira] [Assigned] (HBASE-4946) HTable.coprocessorExec (and possibly coprocessorProxy) does not work with dynamically loaded coprocessors (from hdfs or local system), because the RPC system tries to de

2011-12-05 Thread Ted Yu (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-4946:
-

Assignee: Andrei Dragomir

> HTable.coprocessorExec (and possibly coprocessorProxy) does not work with 
> dynamically loaded coprocessors (from hdfs or local system), because the RPC 
> system tries to deserialize an unknown class. 
> -
>
> Key: HBASE-4946
> URL: https://issues.apache.org/jira/browse/HBASE-4946
> Project: HBase
>  Issue Type: Bug
>  Components: coprocessors
>Affects Versions: 0.92.0
>Reporter: Andrei Dragomir
>Assignee: Andrei Dragomir
> Attachments: HBASE-4946-v2.patch, HBASE-4946.patch
>
>
> Loading coprocessor jars from hdfs works fine. I load the jar from the shell, 
> after setting the attribute, and it gets loaded:
> {noformat}
> INFO org.apache.hadoop.hbase.regionserver.HRegion: Setting up tabledescriptor 
> config now ...
> INFO org.apache.hadoop.hbase.coprocessor.CoprocessorHost: Class 
> com.MyCoprocessorClass needs to be loaded from a file - 
> hdfs://localhost:9000/coproc/rt-  >0.0.1-SNAPSHOT.jar.
> INFO org.apache.hadoop.hbase.coprocessor.CoprocessorHost: loadInstance: 
> com.MyCoprocessorClass
> INFO org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost: 
> RegionEnvironment createEnvironment
> DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Registered protocol 
> handler: region=t1,,1322572939753.6409aee1726d31f5e5671a59fe6e384f. 
> protocol=com.MyCoprocessorClassProtocol
> INFO org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost: Load 
> coprocessor com.MyCoprocessorClass from HTD of t1 successfully.
> {noformat}
> The problem is that this coprocessor simply extends BaseEndpointCoprocessor, 
> with a dynamic method. When calling this method from the client with 
> HTable.coprocessorExec, I get errors on the HRegionServer, because the call 
> cannot be deserialized from writables. 
> The problem is that Exec tries to do an "early" resolve of the coprocessor 
> class. The coprocessor class is loaded, but only in the context of the 
> HRegionServer / HRegion, not in the class loader that the RPC layer uses. 
> So, the call fails:
> {noformat}
> 2011-12-02 00:34:17,348 ERROR org.apache.hadoop.hbase.io.HbaseObjectWritable: 
> Error in readFields
> java.io.IOException: Protocol class com.MyCoprocessorClassProtocol not found
>   at org.apache.hadoop.hbase.client.coprocessor.Exec.readFields(Exec.java:125)
>   at 
> org.apache.hadoop.hbase.io.HbaseObjectWritable.readObject(HbaseObjectWritable.java:575)
>   at org.apache.hadoop.hbase.ipc.Invocation.readFields(Invocation.java:105)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Connection.processData(HBaseServer.java:1237)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Connection.readAndProcess(HBaseServer.java:1167)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Listener.doRead(HBaseServer.java:703)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.doRunLoop(HBaseServer.java:495)
>   at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Listener$Reader.run(HBaseServer.java:470)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>   at java.lang.Thread.run(Thread.java:680)
> Caused by: java.lang.ClassNotFoundException: com.MyCoprocessorClassProtocol
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
>   at java.lang.Class.forName0(Native Method)
>   at java.lang.Class.forName(Class.java:247)
>   at 
> org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:943)
>   at org.apache.hadoop.hbase.client.coprocessor.Exec.readFields(Exec.java:122)
>   ... 10 more
> {noformat}
> Probably the correct way to fix this is to make Exec really smart, so that it 
> knows all the class definitions loaded in CoprocessorHost(s).
> I created a small patch that simply doesn't resolve the class definition in 
> the Exec, instead passing it as string down to the HRegion layer. This layer 
> knows all the definitions, and simply loads it by name. 
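
The idea behind the patch can be sketched in plain Java. This is not the attached patch: ExecSketch and RegionProtocolRegistry are hypothetical names, and the sketch only assumes that the RPC payload keeps the protocol as a string and that the region layer owns a registry of the dynamically loaded classes.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for an RPC payload that carries the protocol by name only. */
final class ExecSketch {
    final String protocolName;  // kept as a string; not resolved during deserialization
    final String methodName;

    ExecSketch(String protocolName, String methodName) {
        this.protocolName = protocolName;
        this.methodName = methodName;
    }
}

/** Hypothetical stand-in for the region layer, which knows the dynamically loaded classes. */
final class RegionProtocolRegistry {
    private final Map<String, Class<?>> protocols = new ConcurrentHashMap<>();

    /** Called when a coprocessor registers its protocol handler. */
    void register(Class<?> protocol) {
        protocols.put(protocol.getName(), protocol);
    }

    /** Resolves the protocol late, by name; returns null when it was never registered. */
    Class<?> resolve(ExecSketch exec) {
        return protocols.get(exec.protocolName);
    }
}
```

Because the lookup happens where the class was registered, the deserializing side never needs the class on its own class path.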

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Assigned] (HBASE-4936) Cached HRegionInterface connections crash when getting UnknownHost exceptions

2011-12-03 Thread Ted Yu (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-4936:
-

Assignee: Andrei Dragomir

> Cached HRegionInterface connections crash when getting UnknownHost exceptions
> -
>
> Key: HBASE-4936
> URL: https://issues.apache.org/jira/browse/HBASE-4936
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 0.92.0
>Reporter: Andrei Dragomir
>Assignee: Andrei Dragomir
> Attachments: HBASE-4936.patch
>
>
> This issue is unlikely to come up in a cluster test case. However, during 
> development, the following happens: 
> 1. Start the HBase cluster locally, on network A (DNS A, etc.)
> 2. The region locations are cached using the hostname 
> (mycomputer.company.com, 211.x.y.z - real ip)
> 3. Change network location (go home)
> 4. Start the HBase cluster locally. My hostname / IPs are now different 
> (mycomputer, 192.168.0.130 - new ip)
> If the region locations have been cached using the hostname, 
> CatalogTracker.getCachedConnection(ServerName sn) throws an 
> UnknownHostException that none of its catch clauses handles. The server 
> will crash constantly. 
> The error should be caught and not rethrown, so that the cached connection 
> expires normally. 
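
A minimal sketch of the proposed handling, in plain Java. The Connector interface and the null return value are assumptions made for illustration; the real CatalogTracker code and the attached patch may differ.

```java
import java.net.UnknownHostException;

final class CachedConnectionSketch {
    /** Hypothetical connector; stands in for whatever opens the region server connection. */
    interface Connector {
        Object connect(String hostname) throws UnknownHostException;
    }

    /**
     * Returns the connection, or null when the cached hostname no longer resolves,
     * so the caller can treat the cached location as stale instead of crashing.
     */
    static Object getCachedConnection(Connector connector, String hostname) {
        try {
            return connector.connect(hostname);
        } catch (UnknownHostException e) {
            // Assumption: a null return signals "no usable connection" to the caller.
            return null;
        }
    }
}
```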

[jira] [Assigned] (HBASE-4942) HMaster is unable to start if HFile V1 is used

2011-12-03 Thread Ted Yu (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-4942:
-

Assignee: honghua zhu

> HMaster is unable to start if HFile V1 is used
> --
>
> Key: HBASE-4942
> URL: https://issues.apache.org/jira/browse/HBASE-4942
> Project: HBase
>  Issue Type: Bug
>  Components: io
>Affects Versions: 0.92.0
>Reporter: Ted Yu
>Assignee: honghua zhu
> Fix For: 0.92.0, 0.94.0
>
>
> This was reported by HH Zhu (zhh200...@gmail.com)
> Specify the following in hbase-site.xml:
> {code}
> <property>
>   <name>hfile.format.version</name>
>   <value>1</value>
> </property>
> {code}
> Then clear the hdfs directory "hbase.rootdir" so that 
> MasterFileSystem.bootstrap() is executed.
> You would see:
> {code}
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderV1.close(HFileReaderV1.java:358)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFile$Reader.close(StoreFile.java:1083)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFile.closeReader(StoreFile.java:570)
> at org.apache.hadoop.hbase.regionserver.Store.close(Store.java:441)
> at org.apache.hadoop.hbase.regionserver.HRegion.doClose(HRegion.java:782)
> at org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:717)
> at org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:688)
> at 
> org.apache.hadoop.hbase.master.MasterFileSystem.bootstrap(MasterFileSystem.java:390)
> at 
> org.apache.hadoop.hbase.master.MasterFileSystem.checkRootDir(MasterFileSystem.java:356)
> at 
> org.apache.hadoop.hbase.master.MasterFileSystem.createInitialFileSystemLayout(MasterFileSystem.java:128)
> at 
> org.apache.hadoop.hbase.master.MasterFileSystem.<init>(MasterFileSystem.java:113)
> at 
> org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:435)
> at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:314)
> at java.lang.Thread.run(Thread.java:619)
> {code}
> The above exception would lead to:
> {code}
> java.lang.RuntimeException: HMaster Aborted
> at 
> org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:152)
> at 
> org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:103)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> at 
> org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:76)
> at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:1512)
> {code}
> In org.apache.hadoop.hbase.master.HMaster.HMaster(Configuration conf), we 
> have:
> {code}
> this.conf.setFloat(CacheConfig.HFILE_BLOCK_CACHE_SIZE_KEY, 0.0f);
> {code}
> When CacheConfig is instantiated, the following is called:
> {code}
> org.apache.hadoop.hbase.io.hfile.CacheConfig.instantiateBlockCache(Configuration
>  conf)
> {code}
> Since "hfile.block.cache.size" is 0.0, instantiateBlockCache() returns 
> null, leaving the blockCache field of CacheConfig null.
> When the master closes the Root region, 
> org.apache.hadoop.hbase.io.hfile.HFileReaderV1.close(boolean evictOnClose) 
> is called. cacheConf.getBlockCache() returns null, and using that null 
> reference throws the NullPointerException that aborts the master.
> The following check should be made in HFileReaderV1.close(), similar to the 
> code in HFileReaderV2.close():
> {code}
> if (evictOnClose && cacheConf.isBlockCacheEnabled())
> {code}
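
To show the effect of that guard in isolation, here is a small self-contained sketch. BlockCache and CacheConf below are simplified, hypothetical stand-ins, not the HBase classes; only the shape of the condition is taken from the report.

```java
final class BlockCacheGuardSketch {
    /** Hypothetical minimal cache interface. */
    interface BlockCache {
        int evictBlocksByPrefix(String prefix);
    }

    /** Hypothetical stand-in for CacheConfig: the cache is null when its size is 0. */
    static final class CacheConf {
        private final BlockCache blockCache;

        CacheConf(BlockCache blockCache) { this.blockCache = blockCache; }

        boolean isBlockCacheEnabled() { return blockCache != null; }

        BlockCache getBlockCache() { return blockCache; }
    }

    /** Evicts on close only when a cache exists; returns the number of evicted blocks. */
    static int close(CacheConf cacheConf, String fileName, boolean evictOnClose) {
        if (evictOnClose && cacheConf.isBlockCacheEnabled()) {
            return cacheConf.getBlockCache().evictBlocksByPrefix(fileName);
        }
        return 0;  // nothing to evict, and no null dereference
    }
}
```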

[jira] [Assigned] (HBASE-4899) Region would be assigned twice easily with continually killing server and moving region in testing environment

2011-11-29 Thread Ted Yu (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-4899:
-

Assignee: chunhui shen

> Region would be assigned twice easily with continually  killing server and 
> moving region in testing environment
> ---
>
> Key: HBASE-4899
> URL: https://issues.apache.org/jira/browse/HBASE-4899
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.0
>Reporter: chunhui shen
>Assignee: chunhui shen
> Attachments: hbase-4899.patch
>
>
> Before assigning a region in ServerShutdownHandler#process, the master 
> checks whether the region is in RIT (regions in transition).
> However, this check doesn't work as expected in the following case:
> 1. Move region A from server B to server C
> 2. Kill server B
> 3. Start server B immediately
> Let's see what happens in the code for the above case:
> {code}
> for step1:
> 1.1 server B close the region A,
> 1.2 master setOffline for region 
> A,(AssignmentManager#setOffline:this.regions.remove(regionInfo))
> 1.3 server C start to open region A.(Not completed)
> for step3:
> master ServerShutdownHandler#process() for server B
> {
> ..
> splitlog()
> ...
> List regionsInTransition =
> this.services.getAssignmentManager()
> .processServerShutdown(this.serverName);
> ...
> Skip regions that were in transition unless CLOSING or PENDING_CLOSE
> ...
> assign region
> }
> {code}
> In fact, when ServerShutdownHandler#process() calls 
> this.services.getAssignmentManager().processServerShutdown(this.serverName),
>  region A is in RIT (step 1.3 has not completed), but the returned List 
> regionsInTransition doesn't contain it, because region A was removed from 
> AssignmentManager.regions by AssignmentManager#setOffline in step 1.2.
> Therefore, region A will be assigned twice.
> Killing and restarting one server twice will also easily cause a region 
> to be assigned twice.
> Besides the above cause, there is another possibility: 
> when ServerShutdownHandler#process() executes 
> MetaReader.getServerUserRegions, the result includes a region that is in 
> RIT at that moment.
> But by the time MetaReader.getServerUserRegions completes, the region has 
> been opened on another server and is no longer in RIT.
> In our testing environment, where balancing, moving and killing are executed 
> periodically, a region is often assigned twice, and this is a nuisance 
> because it affects other test cases.
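
The first scenario can be modelled with two collections. This is a deliberately simplified sketch, assuming that the master tracks a region-to-server map and a separate set of regions in transition; the class and method bodies are hypothetical and do not reproduce AssignmentManager.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class DoubleAssignSketch {
    // Hypothetical, simplified master state.
    final Map<String, String> regions = new HashMap<>();      // region -> hosting server
    final Set<String> regionsInTransition = new HashSet<>();  // regions being moved or opened

    /** Step 1.2: the master takes the region offline and starts the move. */
    void setOffline(String region) {
        regions.remove(region);
        regionsInTransition.add(region);
    }

    /** Simplified model of the report: only regions still mapped to the dead server are examined. */
    List<String> processServerShutdown(String deadServer) {
        List<String> inTransition = new ArrayList<>();
        for (Map.Entry<String, String> e : regions.entrySet()) {
            if (e.getValue().equals(deadServer) && regionsInTransition.contains(e.getKey())) {
                inTransition.add(e.getKey());
            }
        }
        return inTransition;
    }

    /** One possible guard (an assumption, not the patch): also consult the RIT set directly. */
    boolean shouldAssign(String region, List<String> reportedInTransition) {
        return !reportedInTransition.contains(region) && !regionsInTransition.contains(region);
    }
}
```

In this model, once setOffline has run for region A, processServerShutdown for server B returns an empty list, so a check based only on that list would assign A a second time.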

[jira] [Assigned] (HBASE-4885) Building against Hadoop 0.23 uses out-of-date MapReduce artifacts

2011-11-28 Thread Ted Yu (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-4885:
-

Assignee: Tom White

> Building against Hadoop 0.23 uses out-of-date MapReduce artifacts
> -
>
> Key: HBASE-4885
> URL: https://issues.apache.org/jira/browse/HBASE-4885
> Project: HBase
>  Issue Type: Bug
>  Components: build
>Reporter: Tom White
>Assignee: Tom White
> Fix For: 0.94.0
>
> Attachments: HBASE-4885.patch
>
>
> The "hadoop-mapred" artifacts have been replaced by "hadoop-mapreduce-*" 
> artifacts in 0.23 onwards.

[jira] [Assigned] (HBASE-4773) HBaseAdmin may leak ZooKeeper connections

2011-11-28 Thread Ted Yu (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-4773:
-

Assignee: xufeng

> HBaseAdmin may leak ZooKeeper connections
> -
>
> Key: HBASE-4773
> URL: https://issues.apache.org/jira/browse/HBASE-4773
> Project: HBase
>  Issue Type: Bug
>  Components: client
>Affects Versions: 0.90.4
>Reporter: gaojinchao
>Assignee: xufeng
>Priority: Critical
> Fix For: 0.90.5
>
> Attachments: 4773.patch, branches_4773.patch, trunk_4773_patch.patch
>
>
> When the master crashes, HBaseAdmin leaks ZooKeeper connections.
> I think we should close the zk connection when throwing 
> MasterNotRunningException:
>   public HBaseAdmin(Configuration c)
>       throws MasterNotRunningException, ZooKeeperConnectionException {
>     this.conf = HBaseConfiguration.create(c);
>     this.connection = HConnectionManager.getConnection(this.conf);
>     this.pause = this.conf.getLong("hbase.client.pause", 1000);
>     this.numRetries = this.conf.getInt("hbase.client.retries.number", 10);
>     this.retryLongerMultiplier =
>         this.conf.getInt("hbase.client.retries.longer.multiplier", 10);
>     // we should add this code and close the zk connection
>     try {
>       this.connection.getMaster();
>     } catch (MasterNotRunningException e) {
>       HConnectionManager.deleteConnection(conf, false);
>       throw e;
>     }
>   }

[jira] [Assigned] (HBASE-4862) Split hlog and open region happening concurrently may cause data loss

2011-11-24 Thread Ted Yu (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-4862:
-

Assignee: chunhui shen

> Split hlog and open region happening concurrently may cause data loss
> ---
>
> Key: HBASE-4862
> URL: https://issues.apache.org/jira/browse/HBASE-4862
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.2
>Reporter: chunhui shen
>Assignee: chunhui shen
> Fix For: 0.92.0, 0.94.0, 0.90.5
>
> Attachments: 4862.patch
>
>
> Case Description:
> 1. The split hlog thread creates a writer for the file region 
> A/recovered.edits/123456 and is appending log entries.
> 2. The regionserver is opening region A now, and in 
> replayRecoveredEditsIfAny() it deletes the file region 
> A/recovered.edits/123456.
> 3. The split hlog thread catches the IO exception and stops parsing this log 
> file. If skipError = true, it adds the file to the corrupt logs. However, 
> data in other regions in this log file will be lost.
> 4. Or, if skipError = false, it checks the filesystem. Of course, the file 
> system is ok, so it only prints an error log and continues assigning regions. 
> Therefore, data in other log files will also be lost!
> The case may happen in the following scenario:
> 1. Move a region from server A to server B
> 2. Kill server A and server B
> 3. Restart server A and server B
> We could prevent this exception by forbidding deletion of a recovered.edits 
> file that the split hlog thread is still appending to.
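
One way to picture the proposed prevention is a shared record of files that are still being written. The sketch below is only an illustration under that assumption; the class and its methods are hypothetical, and the attached patch may coordinate the two threads differently.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

final class RecoveredEditsGuardSketch {
    // Paths the log-splitting thread is currently appending to (hypothetical bookkeeping).
    private final Set<String> beingWritten = ConcurrentHashMap.newKeySet();

    /** Called by the splitting thread before it creates a writer for the path. */
    void startWriting(String path) { beingWritten.add(path); }

    /** Called by the splitting thread after it closes the writer. */
    void finishWriting(String path) { beingWritten.remove(path); }

    /** Returns true when the region-open path may delete the file. */
    boolean mayDelete(String path) {
        return !beingWritten.contains(path);
    }
}
```

A real implementation would also need to make the check and the delete atomic with respect to startWriting; this sketch leaves that out.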

[jira] [Assigned] (HBASE-4864) testRegionTransitionOperations occasional failures

2011-11-24 Thread Ted Yu (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-4864:
-

Assignee: gaojinchao

> testRegionTransitionOperations occasional failures
> --
>
> Key: HBASE-4864
> URL: https://issues.apache.org/jira/browse/HBASE-4864
> Project: HBase
>  Issue Type: Bug
>  Components: test
>Reporter: gaojinchao
>Assignee: gaojinchao
>Priority: Minor
> Fix For: 0.92.0, 0.94.0
>
> Attachments: HBASE-4864_Branch92.patch
>
>
> See these logs:
> https://builds.apache.org/job/HBase-TRUNK-security/ws/trunk/target/surefire-reports/
> It seems that we should wait until the region is added to the online region 
> set.
> I made a patch. Please review.

[jira] [Assigned] (HBASE-4857) Recursive loop on KeeperException in AuthenticationTokenSecretManager/ZKLeaderManager

2011-11-23 Thread Ted Yu (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-4857:
-

Assignee: Gary Helmling

> Recursive loop on KeeperException in 
> AuthenticationTokenSecretManager/ZKLeaderManager
> -
>
> Key: HBASE-4857
> URL: https://issues.apache.org/jira/browse/HBASE-4857
> Project: HBase
>  Issue Type: Bug
>  Components: security
>Affects Versions: 0.92.0, 0.94.0
>Reporter: Gary Helmling
>Assignee: Gary Helmling
> Fix For: 0.92.0
>
> Attachments: HBASE-4857.patch
>
>
> Looking through stack traces for {{TestMasterFailover}}, I see a case where 
> the leader {{AuthenticationTokenSecretManager}} can get into a recursive loop 
> when a {{KeeperException}} is encountered:
> {noformat}
> "Thread-1-EventThread" daemon prio=10 tid=0x7f9fb47b2800 nid=0x77f6 
> waiting on condition [0x7f9fab376000]
>java.lang.Thread.State: TIMED_WAITING (sleeping)
> at java.lang.Thread.sleep(Native Method)
> at java.lang.Thread.sleep(Thread.java:302)
> at java.util.concurrent.TimeUnit.sleep(TimeUnit.java:328)
> at 
> org.apache.hadoop.hbase.util.RetryCounter.sleepUntilNextRetry(RetryCounter.java:55)
> at 
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:206)
> at 
> org.apache.hadoop.hbase.zookeeper.ZKUtil.createAndFailSilent(ZKUtil.java:891)
> at 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.createBaseZNodes(ZooKeeperWatcher.java:161)
> at 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.<init>(ZooKeeperWatcher.java:154)
> at 
> org.apache.hadoop.hbase.master.HMaster.tryRecoveringExpiredZKSession(HMaster.java:1397)
> at org.apache.hadoop.hbase.master.HMaster.abortNow(HMaster.java:1435)
> at org.apache.hadoop.hbase.master.HMaster.abort(HMaster.java:1374)
> at 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.abort(ZooKeeperWatcher.java:450)
> at 
> org.apache.hadoop.hbase.zookeeper.ZKLeaderManager.stepDownAsLeader(ZKLeaderManager.java:166)
> at 
> org.apache.hadoop.hbase.security.token.AuthenticationTokenSecretManager$LeaderElector.stop(AuthenticationTokenSecretManager.java:293)
> at 
> org.apache.hadoop.hbase.zookeeper.ZKLeaderManager.stepDownAsLeader(ZKLeaderManager.java:167)
> at 
> org.apache.hadoop.hbase.security.token.AuthenticationTokenSecretManager$LeaderElector.stop(AuthenticationTokenSecretManager.java:293)
> at 
> org.apache.hadoop.hbase.zookeeper.ZKLeaderManager.stepDownAsLeader(ZKLeaderManager.java:167)
> at 
> org.apache.hadoop.hbase.security.token.AuthenticationTokenSecretManager$LeaderElector.stop(AuthenticationTokenSecretManager.java:293)
> at 
> org.apache.hadoop.hbase.zookeeper.ZKLeaderManager.handleLeaderChange(ZKLeaderManager.java:96)
> at 
> org.apache.hadoop.hbase.zookeeper.ZKLeaderManager.nodeDeleted(ZKLeaderManager.java:78)
> at 
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:286)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:521)
> at 
> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:497)
> {noformat}
> The {{KeeperException}} causes {{ZKLeaderManager}} to call 
> {{AuthenticationTokenSecretManager$LeaderElector.stop()}}, which calls 
> {{ZKLeaderManager.stepDownAsLeader()}}, which will encounter another 
> {{KeeperException}}, and so on...
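
A common way to break this kind of cycle is a one-shot guard on stop(). The sketch below is an assumption for illustration, not the attached patch: it simulates a step-down that always fails and shows that the guard lets the step-down run only once.

```java
import java.util.concurrent.atomic.AtomicBoolean;

final class LeaderStopGuardSketch {
    private final AtomicBoolean stopped = new AtomicBoolean(false);
    int stepDownCalls = 0;  // counts how often the step-down work actually runs

    /** Called on error and from stepDownAsLeader(); only the first call proceeds. */
    void stop() {
        if (!stopped.compareAndSet(false, true)) {
            return;  // already stopping: break the stop -> stepDown -> stop cycle
        }
        stepDownAsLeader();
    }

    /** Simulates a step-down that always fails and therefore calls stop() again. */
    void stepDownAsLeader() {
        stepDownCalls++;
        stop();  // stands in for the error path triggered by a KeeperException
    }
}
```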

[jira] [Assigned] (HBASE-4856) Upgrade zookeeper to 3.4.0 release

2011-11-23 Thread Ted Yu (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-4856:
-

Assignee: Ted Yu

> Upgrade zookeeper to 3.4.0 release
> --
>
> Key: HBASE-4856
> URL: https://issues.apache.org/jira/browse/HBASE-4856
> Project: HBase
>  Issue Type: Task
>Reporter: Ted Yu
>Assignee: Ted Yu
>
> Zookeeper 3.4.0 has been released.
> We should upgrade.

[jira] [Assigned] (HBASE-4839) Re-enable TestInstantSchemaChangeFailover#testInstantSchemaOperationsInZKForMasterFailover

2011-11-21 Thread Ted Yu (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-4839:
-

Assignee: Subbu M Iyer

> Re-enable 
> TestInstantSchemaChangeFailover#testInstantSchemaOperationsInZKForMasterFailover
> --
>
> Key: HBASE-4839
> URL: https://issues.apache.org/jira/browse/HBASE-4839
> Project: HBase
>  Issue Type: Test
>Reporter: Ted Yu
>Assignee: Subbu M Iyer
>
> TestInstantSchemaChangeFailover#testInstantSchemaOperationsInZKForMasterFailover
>  was disabled for instant schema change (HBASE-4213) after it failed on 
> Jenkins.
> We should enable it and make it pass on Jenkins and dev environments.

[jira] [Assigned] (HBASE-4799) Catalog Janitor logic bug causes region leakage

2011-11-16 Thread Ted Yu (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-4799:
-

Assignee: Max Lapan

> Catalog Janitor logic bug causes region leakage
> 
>
> Key: HBASE-4799
> URL: https://issues.apache.org/jira/browse/HBASE-4799
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.90.4
>Reporter: Max Lapan
>Assignee: Max Lapan
>Priority: Critical
> Attachments: 0001-Fix-of-Regions-Leaks-problem-in-janitor.patch, 
> 0002-Temporary-fix-to-remove-leaked-regions.patch
>
>
> When a region split takes a significant amount of time, CatalogJanitor can 
> clean up one of the SPLIT records but leave the other in META. When the 
> other split finishes, the janitor cleans the remaining SPLIT record, but the 
> parent region is not removed from the FS and META is not cleared.
> The race condition is as follows:
> 1. A region split starts.
> 2. One daughter region, A, has finished splitting (it has no reference 
> storefiles), but the other, B, has not.
> 3. The janitor starts and, in the routine checkDaughter, removes SPLITA from 
> META, but sees that SPLITB has references and does nothing.
> 4. Region B completes its split.
> 5. The janitor wakes up and removes SPLITB, but sees that there is no record 
> for A and does nothing again.
> Result: the parent region hangs forever.
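
The steps above suggest that the janitor should decide about the parent from both daughters at once, and change nothing until both are free of references. The following sketch illustrates that rule only; ParentRow and cleanParent here are hypothetical simplifications, not the CatalogJanitor code or the attached patches.

```java
final class JanitorSketch {
    /** Hypothetical view of a split parent's row in META. */
    static final class ParentRow {
        boolean daughterAHasReferences;
        boolean daughterBHasReferences;
        boolean parentRemoved;
    }

    /**
     * Decides on the parent from both daughters together and modifies nothing until
     * neither daughter holds references, so no pass leaves the row half cleaned.
     */
    static boolean cleanParent(ParentRow row) {
        if (row.daughterAHasReferences || row.daughterBHasReferences) {
            return false;  // leave both split records in place and retry on the next pass
        }
        row.parentRemoved = true;
        return true;
    }
}
```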

[jira] [Assigned] (HBASE-4478) Improve AssignmentManager.handleRegion so that it can process certain ZK state in the case of RS offline

2011-11-10 Thread Ted Yu (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-4478:
-

Assignee: ramkrishna.s.vasudevan  (was: Ming Ma)

> Improve AssignmentManager.handleRegion so that it can process certain ZK 
> state in the case of RS offline
> 
>
> Key: HBASE-4478
> URL: https://issues.apache.org/jira/browse/HBASE-4478
> Project: HBase
>  Issue Type: Bug
>Reporter: Ming Ma
>Assignee: ramkrishna.s.vasudevan
>
> Currently AssignmentManager.handleRegion skips processing of ZK event change 
> if the RS is offline. It relies on TimeoutMonitor and ServerShutdownHandler 
> to process RIT.
>   // Verify this is a known server
>   if (!serverManager.isServerOnline(sn) &&
>   !this.master.getServerName().equals(sn)) {
> LOG.warn("Attempted to handle region transition for server but " +
>   "server is not online: " + Bytes.toString(data.getRegionName()));
> return;
>   }
> For certain states like OPENED, OPENING, FAILED_OPEN, CLOSED, the processing 
> can continue even if the RS is offline.
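
The suggested relaxation could look roughly like the sketch below. The State enum uses the state names from the description plus CLOSING for contrast, and shouldHandle is a hypothetical helper; neither is the AssignmentManager API.

```java
import java.util.EnumSet;
import java.util.Set;

final class OfflineServerTransitionSketch {
    /** States named in the description, plus CLOSING as an example of a state that is still skipped. */
    enum State { OPENING, OPENED, FAILED_OPEN, CLOSED, CLOSING }

    // States that would still be processed when the reporting server is offline.
    private static final Set<State> PROCESS_WHEN_OFFLINE =
        EnumSet.of(State.OPENED, State.OPENING, State.FAILED_OPEN, State.CLOSED);

    /** Decides whether a region transition event should be handled. */
    static boolean shouldHandle(boolean serverOnline, boolean serverIsMaster, State state) {
        if (serverOnline || serverIsMaster) {
            return true;  // known server: handle as before
        }
        return PROCESS_WHEN_OFFLINE.contains(state);
    }
}
```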

[jira] [Assigned] (HBASE-4746) Use a random ZK client port in unit tests so we can run them in parallel

2011-11-06 Thread Ted Yu (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-4746:
-

Assignee: Mikhail Bautin

> Use a random ZK client port in unit tests so we can run them in parallel
> 
>
> Key: HBASE-4746
> URL: https://issues.apache.org/jira/browse/HBASE-4746
> Project: HBase
>  Issue Type: Improvement
>Reporter: Mikhail Bautin
>Assignee: Mikhail Bautin
> Attachments: 4746-trunk-v2.txt, D255.1.patch, D279.1.patch, 
> D279.2.patch
>
>
> The hard-coded ZK client port has long been a problem for running the HBase test 
> suite in parallel. The mini ZK cluster should run on a random free port, and 
> that port should be passed to all parts of the unit tests that need to talk 
> to the mini cluster. In fact, randomizing the port exposes a lot of places in 
> the code where a new configuration is instantiated, and as a result the 
> client tries to talk to the default ZK client port and times out.
> The initial fix is for 0.89-fb, where it already allows the unit tests to run 
> in parallel in 10 minutes. A fix for the trunk will follow.
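
For reference, a common JDK idiom for picking a random free port is to bind a ServerSocket to port 0. The helper below is a generic sketch with a hypothetical name; it is not taken from the attached patches.

```java
import java.io.IOException;
import java.net.ServerSocket;

final class FreePortSketch {
    /** Asks the OS for an unused port by binding to port 0, then releases it. */
    static int findFreePort() {
        try (ServerSocket socket = new ServerSocket(0)) {
            return socket.getLocalPort();
        } catch (IOException e) {
            throw new IllegalStateException("no free port available", e);
        }
    }
}
```

Another process can grab the port between the release and the later bind, so the chosen port still has to be passed explicitly to every component that talks to the mini cluster, and a retry on bind failure is advisable.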

[jira] [Assigned] (HBASE-4752) Don't create an unnecessary LinkedList when evicting from the BlockCache

2011-11-06 Thread Ted Yu (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-4752:
-

Assignee: Ted Yu  (was: Benoit Sigoure)

> Don't create an unnecessary LinkedList when evicting from the BlockCache
> 
>
> Key: HBASE-4752
> URL: https://issues.apache.org/jira/browse/HBASE-4752
> Project: HBase
>  Issue Type: Improvement
>  Components: performance, regionserver
>Affects Versions: 0.90.4
>Reporter: Benoit Sigoure
>Assignee: Ted Yu
>Priority: Minor
> Attachments: 
> 0001-HBASE-4752-Don-t-create-an-unnecessary-LinkedList-wh.patch, 
> 4752-trunk.txt
>
>
> When evicting from the BlockCache, the code creates a LinkedList containing 
> every single block sorted by access time.  This list is created from a 
> PriorityQueue.  I don't believe it is necessary, as the PriorityQueue can be 
> used directly.
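
The point can be shown with a plain java.util.PriorityQueue. The Block class and evict method below are hypothetical and much simpler than the BlockCache code; the sketch only assumes that the queue is ordered by access time, oldest first.

```java
import java.util.PriorityQueue;

final class EvictionSketch {
    /** Hypothetical cached block: a name, a size in bytes, and a last-access timestamp. */
    static final class Block {
        final String name;
        final long size;
        final long accessTime;

        Block(String name, long size, long accessTime) {
            this.name = name;
            this.size = size;
            this.accessTime = accessTime;
        }
    }

    /**
     * Polls the least recently accessed blocks straight from the queue until enough
     * bytes are freed; no intermediate sorted list is built. Returns the bytes freed.
     */
    static long evict(PriorityQueue<Block> byAccessTime, long bytesToFree) {
        long freed = 0;
        while (freed < bytesToFree && !byAccessTime.isEmpty()) {
            Block oldest = byAccessTime.poll();
            freed += oldest.size;
        }
        return freed;
    }
}
```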

[jira] [Assigned] (HBASE-4751) Make TestAdmin#testEnableTableRoundRobinAssignment friendly to concurrent tests

2011-11-04 Thread Ted Yu (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-4751:
-

Assignee: Jieshan Bean

> Make TestAdmin#testEnableTableRoundRobinAssignment friendly to concurrent 
> tests
> ---
>
> Key: HBASE-4751
> URL: https://issues.apache.org/jira/browse/HBASE-4751
> Project: HBase
>  Issue Type: Task
>Reporter: Ted Yu
>Assignee: Jieshan Bean
>
> From 
> https://builds.apache.org/view/G-L/view/HBase/job/HBase-TRUNK/2410/artifact/trunk/target/surefire-reports/org.apache.hadoop.hbase.client.TestAdmin.txt
>  :
> {code}
> testEnableTableRoundRobinAssignment(org.apache.hadoop.hbase.client.TestAdmin) 
>  Time elapsed: 4.345 sec  <<< ERROR!
> java.lang.IllegalArgumentException: Check the value configured in 
> 'zookeeper.znode.parent'. There could be a mismatch with the one configured 
> in the master.
>   at 
> org.apache.hadoop.hbase.zookeeper.RootRegionTracker.waitRootRegionLocation(RootRegionTracker.java:81)
>   at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:753)
>   at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:733)
>   at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:866)
>   at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:765)
>   at 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:733)
>   at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:202)
>   at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:157)
>   at 
> org.apache.hadoop.hbase.client.TestAdmin.testEnableTableRoundRobinAssignment(TestAdmin.java:604)
> {code}
> This was due to:
> {code}
> HTable metaTable = new HTable(HConstants.META_TABLE_NAME);
> {code}
> A few lines above, we have the correct usage:
> {code}
> HTable ht = new HTable(TEST_UTIL.getConfiguration(), tableName);
> {code}





[jira] [Assigned] (HBASE-4741) Online schema change doesn't return errors

2011-11-04 Thread Ted Yu (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-4741:
-

Assignee: (was: Ted Yu)

I may not have time to work on this in the next week.

> Online schema change doesn't return errors
> --
>
> Key: HBASE-4741
> URL: https://issues.apache.org/jira/browse/HBASE-4741
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.0
>Reporter: Jean-Daniel Cryans
>Priority: Critical
> Fix For: 0.92.0
>
> Attachments: 4741-v2.txt, 4741-v3.txt, 4741-v4.txt, 4741.txt
>
>
> Still after the fun I had over in HBASE-4729, I tried to finish altering my 
> table (remove a family) since only half of it was changed so I did this:
> {quote}
> hbase(main):002:0> alter 'TestTable', NAME => 'allo', METHOD => 'delete' 
> Updating all regions with the new schema...
> 244/244 regions updated.
> Done.
> 0 row(s) in 1.2480 seconds
> {quote}
> Nice it all looks good, but over in the master log:
> {quote}
> org.apache.hadoop.hbase.InvalidFamilyOperationException: Family 'allo' does 
> not exist so cannot be deleted
> at 
> org.apache.hadoop.hbase.master.handler.TableDeleteFamilyHandler.handleTableOperation(TableDeleteFamilyHandler.java:56)
> at 
> org.apache.hadoop.hbase.master.handler.TableEventHandler.process(TableEventHandler.java:86)
> at 
> org.apache.hadoop.hbase.master.HMaster.deleteColumn(HMaster.java:1011)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:348)
> at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1242)
> {quote}
> Maybe we should do checks before launching the async task.
> Marking critical as this is a regression.
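
A minimal sketch of the "check before launching the async task" suggestion, using only JDK classes. The SchemaChangeSketch class, its deleteFamily method and its parameters are hypothetical and do not correspond to the HMaster or TableEventHandler code; the point is only that validation runs on the caller's thread, so a failure reaches the caller rather than ending up only in a server log.

```java
import java.util.Set;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

final class SchemaChangeSketch {
    // Validates the request synchronously, then hands the actual work to the
    // executor. If the family is unknown, the exception is thrown to the
    // caller before any asynchronous task is scheduled.
    static Future<?> deleteFamily(Set<String> existingFamilies, String family,
                                  ExecutorService executor, Runnable alterTask) {
        if (!existingFamilies.contains(family)) {
            throw new IllegalArgumentException(
                "Family '" + family + "' does not exist so cannot be deleted");
        }
        return executor.submit(alterTask);
    }
}
```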





[jira] [Assigned] (HBASE-4741) Online schema change doesn't return errors

2011-11-03 Thread Ted Yu (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-4741:
-

Assignee: Ted Yu

> Online schema change doesn't return errors
> --
>
> Key: HBASE-4741
> URL: https://issues.apache.org/jira/browse/HBASE-4741
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.0
>Reporter: Jean-Daniel Cryans
>Assignee: Ted Yu
>Priority: Critical
> Fix For: 0.92.0
>
>
> Still after the fun I had over in HBASE-4729, I tried to finish altering my 
> table (remove a family) since only half of it was changed so I did this:
> {quote}
> hbase(main):002:0> alter 'TestTable', NAME => 'allo', METHOD => 'delete' 
> Updating all regions with the new schema...
> 244/244 regions updated.
> Done.
> 0 row(s) in 1.2480 seconds
> {quote}
> Nice it all looks good, but over in the master log:
> {quote}
> org.apache.hadoop.hbase.InvalidFamilyOperationException: Family 'allo' does 
> not exist so cannot be deleted
> at 
> org.apache.hadoop.hbase.master.handler.TableDeleteFamilyHandler.handleTableOperation(TableDeleteFamilyHandler.java:56)
> at 
> org.apache.hadoop.hbase.master.handler.TableEventHandler.process(TableEventHandler.java:86)
> at 
> org.apache.hadoop.hbase.master.HMaster.deleteColumn(HMaster.java:1011)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:348)
> at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1242)
> {quote}
> Maybe we should do checks before launching the async task.
> Marking critical as this is a regression.





[jira] [Assigned] (HBASE-3316) Add support for Java Serialization to HbaseObjectWritable

2011-10-31 Thread Ted Yu (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-3316:
-

Assignee: Ed Kohlwey

> Add support for Java Serialization to HbaseObjectWritable
> -
>
> Key: HBASE-3316
> URL: https://issues.apache.org/jira/browse/HBASE-3316
> Project: HBase
>  Issue Type: New Feature
>  Components: io
>Affects Versions: 0.92.0
>Reporter: Ed Kohlwey
>Assignee: Ed Kohlwey
>Priority: Minor
> Fix For: 0.92.0
>
> Attachments: HBASE-3316.patch
>
>
> It is convenient in some situations to have HbaseObjectWritable write 
> serializable Java objects, for instance when prototyping new code where you 
> don't want to take the time to implement a writable.
> Adding this support adds no overhead compared to the current implementation.
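
A minimal sketch of what writing and reading a Serializable object with standard Java serialization looks like, using only JDK classes. The SerializableFallbackSketch class and its methods are hypothetical; this is not the HbaseObjectWritable wire format or the attached patch.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

final class SerializableFallbackSketch {
    // Serializes any Serializable value to a byte array.
    static byte[] write(Serializable value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reads back a value produced by write().
    static Object read(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A type that does not implement a writable interface can be passed through this path unchanged, which is the prototyping convenience the description refers to.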





[jira] [Assigned] (HBASE-4690) Intermittent TestRegionServerCoprocessorExceptionWithAbort#testExceptionFromCoprocessorDuringPut failure

2011-10-29 Thread Ted Yu (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-4690:
-

Assignee: Ted Yu  (was: Eugene Koontz)

> Intermittent 
> TestRegionServerCoprocessorExceptionWithAbort#testExceptionFromCoprocessorDuringPut
>  failure
> 
>
> Key: HBASE-4690
> URL: https://issues.apache.org/jira/browse/HBASE-4690
> Project: HBase
>  Issue Type: Test
>Affects Versions: 0.92.0
>Reporter: Ted Yu
>Assignee: Ted Yu
> Fix For: 0.92.0
>
> Attachments: 4690-trunk.txt
>
>
> See 
> https://builds.apache.org/view/G-L/view/HBase/job/HBase-0.92/83/testReport/junit/org.apache.hadoop.hbase.coprocessor/TestRegionServerCoprocessorExceptionWithAbort/testExceptionFromCoprocessorDuringPut/
> Somehow getRSForFirstRegionInTable() wasn't able to retrieve the region 
> server.
> One fix for this issue is to spin up MiniCluster with 1 region server so that 
> we don't need to search for the region server where the first region is hosted.





[jira] [Assigned] (HBASE-4702) Allow override of scan cache value for rowcounter

2011-10-29 Thread Ted Yu (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-4702:
-

Assignee: Ted Yu

> Allow override of scan cache value for rowcounter
> -
>
> Key: HBASE-4702
> URL: https://issues.apache.org/jira/browse/HBASE-4702
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.2
> Environment: Operating System: Linux
>Reporter: Rita M
>Assignee: Ted Yu
>
> Doing a row count for a large table via MapReduce may take a long time.
> I tried to set the default scan cache size, but there is no knob to tune it.
> See here for more details, 
> http://search-hadoop.com/m/ECEs6237AIX&subj=Re+speeding+up+rowcount





[jira] [Assigned] (HBASE-4641) Block cache can be mistakenly instantiated on Master

2011-10-28 Thread Ted Yu (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-4641:
-

Assignee: Ted Yu  (was: Jonathan Gray)

> Block cache can be mistakenly instantiated on Master
> 
>
> Key: HBASE-4641
> URL: https://issues.apache.org/jira/browse/HBASE-4641
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.92.0, 0.94.0
>Reporter: Jonathan Gray
>Assignee: Ted Yu
>Priority: Critical
> Fix For: 0.92.0
>
> Attachments: 4641-suggestion-v3.txt, 4641-v4.txt, 
> HBASE-4641-v1.patch, HBASE-4641-v2.patch
>
>
> After changes in the block cache instantiation over in HBASE-4422, it looks 
> like the HMaster can now end up with a block cache instantiated.  Not a huge 
> deal but prevents the process from shutting down properly.





[jira] [Assigned] (HBASE-4669) Add an option of using round-robin assignment for enabling table

2011-10-27 Thread Ted Yu (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-4669:
-

Assignee: Jieshan Bean

> Add an option of using round-robin assignment for enabling table
> 
>
> Key: HBASE-4669
> URL: https://issues.apache.org/jira/browse/HBASE-4669
> Project: HBase
>  Issue Type: Improvement
>  Components: master
>Affects Versions: 0.90.4, 0.94.0
>Reporter: Jieshan Bean
>Assignee: Jieshan Bean
>Priority: Minor
> Fix For: 0.94.0
>
> Attachments: HBASE-4669-90-V2.patch, HBASE-4669-90.patch, 
> HBASE-4669-Trunk-V2.patch, HBASE-4669-Trunk.patch
>
>
> In some scenarios we disable and then re-enable a table. Currently, enabling 
> a table uses random assignment. We would like the regions to be distributed 
> more evenly, regardless of the number of regions and regionservers.
> So I suggest adding an option to use round-robin assignment when enabling a 
> table.
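
A minimal sketch of round-robin assignment, using only JDK classes. The RoundRobinSketch class and its assign method are hypothetical, region and server names are plain strings, and server names are assumed to be distinct; this is not the master's assignment code.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class RoundRobinSketch {
    // Deals regions out to servers in turn, so the number of regions per
    // server differs by at most one.
    static Map<String, List<String>> assign(List<String> regions, List<String> servers) {
        if (servers.isEmpty()) {
            throw new IllegalArgumentException("no servers to assign to");
        }
        Map<String, List<String>> plan = new LinkedHashMap<>();
        for (String server : servers) {
            plan.put(server, new ArrayList<>());
        }
        for (int i = 0; i < regions.size(); i++) {
            String server = servers.get(i % servers.size());
            plan.get(server).add(regions.get(i));
        }
        return plan;
    }
}
```

With random assignment the per-server counts can be arbitrarily uneven; with the scheme above they are balanced by construction, whatever the number of regions and servers.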





[jira] [Assigned] (HBASE-4578) NPE when altering a table that has moving regions

2011-10-22 Thread Ted Yu (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-4578:
-

Assignee: gaojinchao

> NPE when altering a table that has moving regions
> -
>
> Key: HBASE-4578
> URL: https://issues.apache.org/jira/browse/HBASE-4578
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.0
>Reporter: Jean-Daniel Cryans
>Assignee: gaojinchao
>Priority: Blocker
> Fix For: 0.92.0
>
> Attachments: HBASE-4578_trial_Trunk.patch
>
>
> I'm still not 100% sure of the source of this error, but here's what I was 
> able to get twice while altering a table that was doing a bunch of splits:
> {quote}
> 2011-10-11 23:48:59,344 INFO 
> org.apache.hadoop.hbase.master.handler.SplitRegionHandler: Handled SPLIT 
> report); 
> parent=TestTable,0002608338,1318376880454.a75d6815fdfc513fb1c8aabe086c6763. 
> daughter 
> a=TestTable,0002608338,1318376938764.ef170ff6cd8695dc8aec92e542dc9ac1.daughter
>  b=TestTable,0003301408,1318376938764.36eb2530341bd46888ede312c5559b5d.
> 2011-10-11 23:49:09,579 DEBUG 
> org.apache.hadoop.hbase.master.handler.TableEventHandler: Ignoring table not 
> disabled exception for supporting online schema changes.
> 2011-10-11 23:49:09,580 INFO 
> org.apache.hadoop.hbase.master.handler.TableEventHandler: Handling table 
> operation C_M_MODIFY_TABLE on table TestTable
> 2011-10-11 23:49:09,612 INFO org.apache.hadoop.hbase.util.FSUtils: 
> TableInfoPath = hdfs://sv4r11s38:9100/hbase/TestTable/.tableinfo tmpPath = 
> hdfs://sv4r11s38:9100/hbase/TestTable/.tmp/.tableinfo.1318376949612
> 2011-10-11 23:49:09,692 INFO org.apache.hadoop.hbase.util.FSUtils: 
> TableDescriptor stored. TableInfoPath = 
> hdfs://sv4r11s38:9100/hbase/TestTable/.tableinfo
> 2011-10-11 23:49:09,693 INFO org.apache.hadoop.hbase.util.FSUtils: Updated 
> tableinfo=hdfs://sv4r11s38:9100/hbase/TestTable/.tableinfo to blah
> 2011-10-11 23:49:09,695 INFO 
> org.apache.hadoop.hbase.master.handler.TableEventHandler: Bucketing regions 
> by region server...
> 2011-10-11 23:49:09,695 DEBUG org.apache.hadoop.hbase.client.MetaScanner: 
> Scanning .META. starting at row=TestTable,,00 for max=2147483647 
> rows
> 2011-10-11 23:49:09,709 DEBUG 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: 
> The connection to hconnection-0x132f043bbde02e9 has been closed.
> 2011-10-11 23:49:09,709 ERROR org.apache.hadoop.hbase.executor.EventHandler: 
> Caught throwable while processing event C_M_MODIFY_TABLE
> java.lang.NullPointerException
>   at java.util.TreeMap.getEntry(TreeMap.java:324)
>   at java.util.TreeMap.containsKey(TreeMap.java:209)
>   at 
> org.apache.hadoop.hbase.master.handler.TableEventHandler.reOpenAllRegions(TableEventHandler.java:114)
>   at 
> org.apache.hadoop.hbase.master.handler.TableEventHandler.process(TableEventHandler.java:90)
>   at 
> org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:168)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>   at java.lang.Thread.run(Thread.java:662)
> {quote}
> The first time the shell reported that all the regions were updated 
> correctly, the second time it got stuck for a while:
> {quote}
> 6/14 regions updated.
> 0/14 regions updated.
> ...
> 0/14 regions updated.
> 2/16 regions updated.
> ...
> 2/16 regions updated.
> 8/9 regions updated.
> ...
> 8/9 regions updated.
> {quote}
> After which I killed it, redid the alter and it worked.
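
The stack trace above ends in TreeMap.containsKey, and a TreeMap that uses natural ordering throws NullPointerException when asked about a null key. The sketch below shows that JDK behavior and a guarded lookup. It is an assumption that a null key is what reached the map in TableEventHandler.reOpenAllRegions (the report itself says the source is uncertain); the NullKeySketch class and its names are hypothetical.

```java
import java.util.TreeMap;

final class NullKeySketch {
    // Returns true if containsKey(null) on a naturally ordered TreeMap
    // throws NullPointerException.
    static boolean nullLookupThrows() {
        TreeMap<String, Integer> regionsByServer = new TreeMap<>();
        regionsByServer.put("server-a", 1);
        try {
            regionsByServer.containsKey(null);
            return false;
        } catch (NullPointerException expected) {
            return true;
        }
    }

    // Guarded lookup: an unknown (null) server is treated as "not present"
    // instead of being passed to the map.
    static boolean containsServer(TreeMap<String, Integer> map, String server) {
        return server != null && map.containsKey(server);
    }
}
```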





[jira] [Assigned] (HBASE-3714) completebulkload does not use HBase configuration

2011-10-21 Thread Ted Yu (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-3714:
-

Assignee: Nichole Treadway

> completebulkload does not use HBase configuration
> -
>
> Key: HBASE-3714
> URL: https://issues.apache.org/jira/browse/HBASE-3714
> Project: HBase
>  Issue Type: Bug
>  Components: mapreduce
>Affects Versions: 0.90.0, 0.90.1, 0.90.2, 0.90.3
>Reporter: Nichole Treadway
>Assignee: Nichole Treadway
>Priority: Minor
> Attachments: 3714-trunk.txt, HBASE-3714.txt
>
>
> The completebulkload tool should be using the HBaseConfiguration.create() 
> method to get the HBase configuration in 0.90.*. In its present state, you 
> receive a connection error when running this tool.





[jira] [Assigned] (HBASE-4644) LoadIncrementalHFiles ignores additional configurations

2011-10-21 Thread Ted Yu (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-4644:
-

Assignee: Alexey Zotov

> LoadIncrementalHFiles ignores additional configurations
> ---
>
> Key: HBASE-4644
> URL: https://issues.apache.org/jira/browse/HBASE-4644
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.3
> Environment: Centos 5.5, Cloudera cdh3u1 distribution.
>Reporter: Alexey Zotov
>Assignee: Alexey Zotov
>Priority: Minor
>  Labels: Configuration, LoadIncrementalHFiles
>
> The run method ignores the configuration that was passed in as a constructor argument:
> {code}
> LoadIncrementalHFiles hFilesMergeTask = new LoadIncrementalHFiles(conf);
> ToolRunner.run(hFilesMergeTask, args); 
> {code}
> This happens because the HTable creation (_new HTable(tableName);_ in the 
> LoadIncrementalHFiles.run() method) skips the existing configuration and 
> creates a new one for the HTable. If there is no hbase-site.xml on the 
> classpath, previously loaded properties (via -conf ) will be lost.
> Quick fix:
> {code}
> --- LoadIncrementalHFiles.java 2011-07-18 08:20:38.0 +0400
> +++ LoadIncrementalHFiles.java 2011-10-19 18:08:31.228972054 +0400
> @@ -447,14 +446,20 @@
>  if (!tableExists) this.createTable(tableName,dirPath);
>  
>  Path hfofDir = new Path(dirPath);
> -HTable table = new HTable(tableName);
> +HTable table;
> +Configuration configuration = getConf();
> +if (configuration != null) {
> +  table = new HTable(configuration, tableName);
> +} else {
> +  table = new HTable(tableName);
> +}
>  
>  doBulkLoad(hfofDir, table);
>  return 0;
>}
> {code}





[jira] [Assigned] (HBASE-4612) Allow ColumnPrefixFilter to support multiple prefixes

2011-10-18 Thread Ted Yu (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-4612:
-

Assignee: Eran Kutner

> Allow ColumnPrefixFilter to support multiple prefixes
> -
>
> Key: HBASE-4612
> URL: https://issues.apache.org/jira/browse/HBASE-4612
> Project: HBase
>  Issue Type: Improvement
>  Components: filters
>Affects Versions: 0.90.4
>Reporter: Eran Kutner
>Assignee: Eran Kutner
>Priority: Minor
> Fix For: 0.94.0
>
> Attachments: HBASE-4612-0.90.patch
>
>
> When there are a lot of columns grouped by name, I've found that it would be 
> very useful to be able to scan them using multiple prefixes, making it 
> possible to fetch specific groups in one scan without fetching the entire 
> row. This is impossible to achieve using a FilterList, so I've added such 
> support to the existing ColumnPrefixFilter while keeping backward 
> compatibility.
> The attached patch is based on 0.90.4. I noticed that the 0.92 branch has a 
> new method to support instantiating filters using Thrift. I'm not sure how 
> the serialization works there so I didn't implement that, but the rest of my 
> code should work in 0.92 as well.
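
A minimal sketch of matching a column qualifier against several prefixes, using only JDK classes. The MultiPrefixSketch class and its methods are hypothetical and do not reproduce the attached patch or the HBase Filter API; a real filter would also need to decide how to skip ahead between prefix groups, which this linear check does not attempt.

```java
import java.util.List;

final class MultiPrefixSketch {
    // True if qualifier begins with every byte of prefix, in order.
    static boolean startsWith(byte[] qualifier, byte[] prefix) {
        if (qualifier.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (qualifier[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    // True if the qualifier matches at least one of the given prefixes.
    static boolean matchesAny(byte[] qualifier, List<byte[]> prefixes) {
        for (byte[] prefix : prefixes) {
            if (startsWith(qualifier, prefix)) {
                return true;
            }
        }
        return false;
    }
}
```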





[jira] [Assigned] (HBASE-4562) When split doing offlineParentInMeta encounters error, it'll cause data loss

2011-10-16 Thread Ted Yu (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-4562:
-

Assignee: bluedavy

> When split doing offlineParentInMeta encounters error, it'll cause data loss
> 
>
> Key: HBASE-4562
> URL: https://issues.apache.org/jira/browse/HBASE-4562
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 0.90.4
>Reporter: bluedavy
>Assignee: bluedavy
>Priority: Blocker
> Fix For: 0.90.5
>
> Attachments: HBASE-4562-0.90.patch, HBASE-4562-0.92.patch, 
> HBASE-4562-trunk.patch, test-4562-0.90.txt, test-4562-0.92.txt, 
> test-4562-trunk.txt
>
>
> Follow the steps below to reproduce the problem:
> 1. Change SplitTransaction.java as below to simulate a timeout error:
>{code:title=SplitTransaction.java|borderStyle=solid}
>   if (!testing) {
> MetaEditor.offlineParentInMeta(server.getCatalogTracker(),
>this.parent.getRegionInfo(), a.getRegionInfo(), b.getRegionInfo());
> throw new IOException("some unexpected error in split");
>   }
>{code} 
> 2. Update the regionserver code and restart;
> 3. Create a table and put some data into it;
> 4. Split the table;
> 5. Kill the regionserver hosting the table;
> 6. Wait some time after the master's ServerShutdownHandler.process executes, 
> then scan the table; you'll find that the data written before is lost.
> The attached patch fixes the bug.





[jira] [Assigned] (HBASE-4595) HFilePrettyPrinter Scanned kv count always 0

2011-10-15 Thread Ted Yu (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-4595:
-

Assignee: Matteo Bertozzi

> HFilePrettyPrinter Scanned kv count always 0
> 
>
> Key: HBASE-4595
> URL: https://issues.apache.org/jira/browse/HBASE-4595
> Project: HBase
>  Issue Type: Bug
>  Components: io
>Affects Versions: 0.92.0, 0.94.0, 0.92.1
>Reporter: Matteo Bertozzi
>Assignee: Matteo Bertozzi
>Priority: Minor
> Attachments: HBASE-4595.patch
>
>
> The "count" variable used to print the "Scanned kv count" is never 
> incremented.
> A local "count" variable in scanKeysValues() method is updated instead.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Assigned] (HBASE-4550) When master passed regionserver different address , because regionserver didn't create new zookeeper znode, as a result stop-hbase.sh is hang

2011-10-12 Thread Ted Yu (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-4550:
-

Assignee: wanbin

> When master passed regionserver different address , because regionserver 
> didn't create new zookeeper znode,  as  a result stop-hbase.sh is hang
> ---
>
> Key: HBASE-4550
> URL: https://issues.apache.org/jira/browse/HBASE-4550
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 0.90.3
>Reporter: wanbin
>Assignee: wanbin
> Fix For: 0.90.5
>
> Attachments: patch
>
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> When the master passes the regionserver a different address, the regionserver 
> doesn't create a new zookeeper znode, but the master stores the new address in 
> ServerManager. When stop-hbase.sh is called, the path received by 
> RegionServerTracker.nodeDeleted is the old address, so 
> serverManager.expireServer is not called and stop-hbase.sh hangs.





[jira] [Assigned] (HBASE-4508) Backport HBASE-3777 to 0.90 branch

2011-10-12 Thread Ted Yu (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-4508:
-

Assignee: Bright Fulton

> Backport HBASE-3777 to 0.90 branch
> --
>
> Key: HBASE-4508
> URL: https://issues.apache.org/jira/browse/HBASE-4508
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Bright Fulton
> Attachments: HBASE-4508.v1.patch, HBASE-4508.v2.patch, 
> HBASE-4508.v3.patch, HBASE-4508.v4.patch
>
>
> See discussion here: 
> http://search-hadoop.com/m/MJBId1aazTR1/backporting+HBASE-3777+to+0.90&subj=backporting+HBASE+3777+to+0+90
> Rocketfuel has been running 0.90.3 with HBASE-3777 since its resolution.
> They have 10 RS nodes, 1 Master and 1 Zookeeper.
> The workload has live writes and reads but is very read-heavy. The cache hit 
> ratio is pretty high.
> The qps on one of their data centers is 50K.





[jira] [Assigned] (HBASE-2794) ROWCOL bloom filter not used if multiple columns within same family are requested in a Get

2011-09-30 Thread Ted Yu (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-2794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-2794:
-

Assignee: Mikhail Bautin

> ROWCOL bloom filter not used if multiple columns within same family are 
> requested in a Get
> --
>
> Key: HBASE-2794
> URL: https://issues.apache.org/jira/browse/HBASE-2794
> Project: HBase
>  Issue Type: Improvement
>  Components: performance
>Reporter: Kannan Muthukkaruppan
>Assignee: Mikhail Bautin
> Fix For: 0.92.0
>
>
> Noticed the following snippet in StoreFile.java:Scanner:shouldSeek():
> {code}
> switch(bloomFilterType) {
>   case ROW:
> key = row;
> break;
>   case ROWCOL:
> if (columns.size() == 1) {
>   byte[] col = columns.first();
>   key = Bytes.add(row, col);
>   break;
> }
> //$FALL-THROUGH$
>   default:
> return true;
> }
> {code}
> If columns.size > 1, then we currently don't take advantage of the bloom 
> filter.  We should optimize this to check the bloom filter for each of the 
> columns and, if none of the columns are present in the bloom filter, avoid 
> opening the file.
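
A minimal sketch of the proposed per-column check, using only JDK classes. The RowColBloomSketch class is hypothetical, rows and columns are plain strings, and the mightContain predicate stands in for a Bloom filter probe (false means "definitely absent"); this is not the StoreFile code.

```java
import java.util.Collection;
import java.util.function.Predicate;

final class RowColBloomSketch {
    // Returns true when the file may hold at least one requested
    // (row, column) pair and therefore has to be opened.
    static boolean shouldSeek(String row, Collection<String> columns,
                              Predicate<String> mightContain) {
        if (columns.isEmpty()) {
            // No explicit columns: nothing can be ruled out, as in the
            // fall-through case of the snippet above.
            return true;
        }
        for (String column : columns) {
            // The probe key is the row followed by the column, mirroring
            // Bytes.add(row, col) in the quoted code.
            if (mightContain.test(row + column)) {
                return true;
            }
        }
        return false;
    }
}
```

Because a Bloom filter can report false positives but not false negatives, skipping the file is safe only when every probe says the key is absent.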





[jira] [Assigned] (HBASE-4492) TestRollingRestart fails intermittently

2011-09-28 Thread Ted Yu (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reassigned HBASE-4492:
-

Assignee: ramkrishna.s.vasudevan  (was: Jonathan Gray)

> TestRollingRestart fails intermittently
> ---
>
> Key: HBASE-4492
> URL: https://issues.apache.org/jira/browse/HBASE-4492
> Project: HBase
>  Issue Type: Test
>Reporter: Ted Yu
>Assignee: ramkrishna.s.vasudevan
> Attachments: 4492-v2.txt, 4492.txt, HBASE-4492.patch
>
>
> I got the following when running test suite on TRUNK:
> {code}
> testBasicRollingRestart(org.apache.hadoop.hbase.master.TestRollingRestart)  
> Time elapsed: 300.28 sec  <<< ERROR!
> java.lang.Exception: test timed out after 30 milliseconds
> at java.lang.Thread.sleep(Native Method)
> at 
> org.apache.hadoop.hbase.master.TestRollingRestart.waitForRSShutdownToStartAndFinish(TestRollingRestart.java:313)
> at 
> org.apache.hadoop.hbase.master.TestRollingRestart.testBasicRollingRestart(TestRollingRestart.java:210)
> {code}
> I ran TestRollingRestart#testBasicRollingRestart manually afterwards, which 
> wiped out the test output file for the failed test.
> A similar failure can be found on Jenkins:
> https://builds.apache.org/view/G-L/view/HBase/job/HBase-0.92/19/testReport/junit/org.apache.hadoop.hbase.master/TestRollingRestart/testBasicRollingRestart/
