[jira] [Commented] (HBASE-3677) Generate a globally unique identifier for a cluster and store in /hbase/hbase.id

2011-04-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13016298#comment-13016298
 ] 

Hudson commented on HBASE-3677:
---

Integrated in HBase-TRUNK #1832 (See 
[https://hudson.apache.org/hudson/job/HBase-TRUNK/1832/])
HBASE-3677  Generate a globally unique cluster ID


 Generate a globally unique identifier for a cluster and store in 
 /hbase/hbase.id
 

 Key: HBASE-3677
 URL: https://issues.apache.org/jira/browse/HBASE-3677
 Project: HBase
  Issue Type: Improvement
  Components: master
Reporter: Gary Helmling
Assignee: Gary Helmling
 Fix For: 0.92.0

 Attachments: HBASE-3677_final.patch


 We don't currently have a way to uniquely identify an HBase cluster, apart 
 from where it's stored in HDFS or the configuration of the ZooKeeper quorum 
 managing it.  It would be generally useful to be able to identify a cluster 
 via the API.
 The proposal here is pretty simple:
 # When master initializes the filesystem, generate a globally unique ID and 
 store in /hbase/hbase.id
 # For existing clusters, generate hbase.id on master startup if it does not 
 exist
 # Include unique ID in ClusterStatus returned from master
 For token authentication, this will be required to allow selecting the 
 correct token to pass to a cluster when a single client is communicating to 
 more than one HBase instance.
 Chatting with J-D: replication stores its own cluster id in place with each 
 HLog edit, so it requires as small an identifier as possible, but I think we 
 could automate a mapping from unique cluster ID to short ID if we had the 
 unique ID available.
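 The proposed flow can be sketched roughly as follows. This is a hypothetical 
 illustration only: it uses the local filesystem in place of HDFS, and the 
 `ClusterIdFile` class and method names are made up, not the actual patch.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.UUID;

// Hypothetical sketch: generate a globally unique cluster ID on first
// startup and reuse it afterwards. The real proposal writes /hbase/hbase.id
// on HDFS; the local filesystem stands in for HDFS here.
public class ClusterIdFile {
    public static String getOrCreate(Path idFile) throws IOException {
        if (Files.exists(idFile)) {
            // Existing cluster: reuse the stored ID.
            return new String(Files.readAllBytes(idFile)).trim();
        }
        // New cluster (or first startup after upgrade): generate and persist.
        String id = UUID.randomUUID().toString();
        Files.write(idFile, id.getBytes());
        return id;
    }
}
```

 Calling getOrCreate twice returns the same ID, matching step 2 of the 
 proposal (generate on startup only if the file does not already exist).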

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-3723) Major compact should be done when there is only one storefile and some keyvalue is outdated.

2011-04-06 Thread zhoushuaifeng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhoushuaifeng updated HBASE-3723:
-

Attachment: hbase-3723.txt

When there is only one file in the store and some keyvalues are outdated, a 
major compaction should be triggered whether or not the file was itself produced 
by a major compaction.

 Major compact should be done when there is only one storefile and some 
 keyvalue is outdated.
 

 Key: HBASE-3723
 URL: https://issues.apache.org/jira/browse/HBASE-3723
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.90.0, 0.90.1
Reporter: zhoushuaifeng
 Fix For: 0.90.2

 Attachments: hbase-3723.txt


 In the function Store.isMajorCompaction:
   if (filesToCompact.size() == 1) {
     // Single file
     StoreFile sf = filesToCompact.get(0);
     long oldest =
         (sf.getReader().timeRangeTracker == null) ?
             Long.MIN_VALUE :
             now - sf.getReader().timeRangeTracker.minimumTimestamp;
     if (sf.isMajorCompaction() &&
         (this.ttl == HConstants.FOREVER || oldest < this.ttl)) {
       if (LOG.isDebugEnabled()) {
         LOG.debug("Skipping major compaction of " + this.storeNameStr +
             " because one (major) compacted file only and oldestTime " +
             oldest + "ms is < ttl=" + this.ttl);
       }
     }
   } else {
 When there is only one storefile in the store and some keyvalues' TTLs have 
 expired, the major-compaction checker should send this region to the compaction 
 queue and run a major compaction to clean out the outdated data. But according 
 to the code in 0.90.1, it will do nothing.
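 The intended decision can be condensed into a small predicate. This is a 
 hypothetical restatement of the check, not the attached patch: a single file 
 should still be major-compacted when its oldest data has outlived the TTL.

```java
// Hypothetical sketch of the single-storefile decision. FOREVER mirrors
// HConstants.FOREVER; oldestMs is the age of the oldest keyvalue in the file.
public class SingleFileCompactionCheck {
    public static final long FOREVER = Long.MAX_VALUE;

    public static boolean needsMajorCompaction(boolean producedByMajorCompaction,
                                               long ttlMs, long oldestMs) {
        boolean hasExpiredData = ttlMs != FOREVER && oldestMs > ttlMs;
        // Compact when outdated keyvalues exist, even if the file was itself
        // produced by an earlier major compaction (the 0.90.1 code skipped
        // this case entirely); otherwise only compact non-major files.
        return hasExpiredData || !producedByMajorCompaction;
    }
}
```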

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-3722) A lot of data is lost when name node crashed

2011-04-06 Thread gaojinchao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

gaojinchao updated HBASE-3722:
--

Attachment: HmasterFilesystem_PatchV1.patch

  A lot of data is lost when name node crashed
 -

 Key: HBASE-3722
 URL: https://issues.apache.org/jira/browse/HBASE-3722
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.90.1
Reporter: gaojinchao
 Attachments: HmasterFilesystem_PatchV1.patch


 I'm not sure exactly what caused it; there are some split-failure logs.
 The master should shut itself down when HDFS has crashed.
  The logs are:
  2011-03-22 13:21:55,056 WARN 
  org.apache.hadoop.hbase.master.LogCleaner: Error while cleaning the 
  logs
  java.net.ConnectException: Call to C4C1/157.5.100.1:9000 failed on 
 connection exception: java.net.ConnectException: Connection refused
  at org.apache.hadoop.ipc.Client.wrapException(Client.java:844)
  at org.apache.hadoop.ipc.Client.call(Client.java:820)
  at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:221)
  at $Proxy5.getListing(Unknown Source)
  at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
  at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
  at java.lang.reflect.Method.invoke(Method.java:597)
  at 
 org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
  at 
 org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
  at $Proxy5.getListing(Unknown Source)
  at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:614)
  at 
 org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:252)
  at 
 org.apache.hadoop.hbase.master.LogCleaner.chore(LogCleaner.java:121)
  at org.apache.hadoop.hbase.Chore.run(Chore.java:66)
  at 
  org.apache.hadoop.hbase.master.LogCleaner.run(LogCleaner.java:154)
  Caused by: java.net.ConnectException: Connection refused
  at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
  at 
 sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
  at 
 org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
  at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:408)
  at 
 org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:332)
  at 
 org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:202)
  at org.apache.hadoop.ipc.Client.getConnection(Client.java:943)
  at org.apache.hadoop.ipc.Client.call(Client.java:788)
  ... 13 more
  2011-03-22 13:21:56,056 INFO org.apache.hadoop.ipc.Client: Retrying connect 
 to server: C4C1/157.5.100.1:9000. Already tried 0 time(s).
  2011-03-22 13:21:57,057 INFO org.apache.hadoop.ipc.Client: Retrying connect 
 to server: C4C1/157.5.100.1:9000. Already tried 1 time(s).
  2011-03-22 13:21:58,057 INFO org.apache.hadoop.ipc.Client: Retrying connect 
 to server: C4C1/157.5.100.1:9000. Already tried 2 time(s).
  2011-03-22 13:21:59,057 INFO org.apache.hadoop.ipc.Client: Retrying connect 
 to server: C4C1/157.5.100.1:9000. Already tried 3 time(s).
  2011-03-22 13:22:00,058 INFO org.apache.hadoop.ipc.Client: Retrying connect 
 to server: C4C1/157.5.100.1:9000. Already tried 4 time(s).
  2011-03-22 13:22:01,058 INFO org.apache.hadoop.ipc.Client: Retrying connect 
 to server: C4C1/157.5.100.1:9000. Already tried 5 time(s).
  2011-03-22 13:22:02,059 INFO org.apache.hadoop.ipc.Client: Retrying connect 
 to server: C4C1/157.5.100.1:9000. Already tried 6 time(s).
  2011-03-22 13:22:03,059 INFO org.apache.hadoop.ipc.Client: Retrying connect 
 to server: C4C1/157.5.100.1:9000. Already tried 7 time(s).
  2011-03-22 13:22:04,059 INFO org.apache.hadoop.ipc.Client: Retrying connect 
 to server: C4C1/157.5.100.1:9000. Already tried 8 time(s).
  2011-03-22 13:22:05,060 INFO org.apache.hadoop.ipc.Client: Retrying connect 
 to server: C4C1/157.5.100.1:9000. Already tried 9 time(s).
  2011-03-22 13:22:05,060 ERROR 
  org.apache.hadoop.hbase.master.MasterFileSystem: Failed splitting 
  hdfs://C4C1:9000/hbase/.logs/C4C9.site,60020,1300767633398
  java.net.ConnectException: Call to C4C1/157.5.100.1:9000 failed on 
 connection exception: java.net.ConnectException: Connection refused
  at org.apache.hadoop.ipc.Client.wrapException(Client.java:844)
  at org.apache.hadoop.ipc.Client.call(Client.java:820)
  at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:221)
  at $Proxy5.getFileInfo(Unknown Source)
  at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
  at 
 

[jira] [Commented] (HBASE-3722) A lot of data is lost when name node crashed

2011-04-06 Thread gaojinchao (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13016350#comment-13016350
 ] 

gaojinchao commented on HBASE-3722:
---

I have tried to fix this bug. Who can review it?
Thanks.


[jira] [Updated] (HBASE-3710) Book.xml - fill out descriptions of metrics

2011-04-06 Thread Doug Meil (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Doug Meil updated HBASE-3710:
-

Attachment: book.xml.patch

 Book.xml - fill out descriptions of metrics
 ---

 Key: HBASE-3710
 URL: https://issues.apache.org/jira/browse/HBASE-3710
 Project: HBase
  Issue Type: Improvement
Reporter: Doug Meil
Assignee: Doug Meil
Priority: Minor
 Attachments: book.xml.patch


 I filled out the skeleton of the metrics in book.xml, but I'd like one of the 
 committers to fill in the rest of the details on what these mean.  Thanks! :-)
 For example:
 ---
 I'm assuming that these refer to the in-memory LRU block cache.  I'd 
 like the docs to state this (for the sake of clarity) and also the units 
 (e.g., MB).
 hbase.regionserver.blockCacheCount
 hbase.regionserver.blockCacheFree
 hbase.regionserver.blockCacheHitRatio
 hbase.regionserver.blockCacheSize
 ---
 This is read latency from HDFS, I assume...
 hbase.regionserver.fsReadLatency_avg_time
 hbase.regionserver.fsReadLatency_num_ops
 hbase.regionserver.fsSyncLatency_avg_time
 hbase.regionserver.fsSyncLatency_num_ops
 hbase.regionserver.fsWriteLatency_avg_time
 hbase.regionserver.fsWriteLatency_num_ops
 
 Point-in-time utilized memstore size (i.e., as opposed to max or trailing), I 
 assume; would be nice to document.
 hbase.regionserver.memstoreSizeMB
 ---
 obvious, but might as well document it
 hbase.regionserver.regions
 --
 This is any Put, Get, Delete, or Scan operation, I assume?
 hbase.regionserver.requests
 --
 Detail on these would be nice, especially tips on whether there are any 
 critical numbers/ratios to watch for.
 hbase.regionserver.storeFileIndexSizeMB
 hbase.regionserver.stores

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-3710) Book.xml - fill out descriptions of metrics

2011-04-06 Thread Doug Meil (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Doug Meil updated HBASE-3710:
-

Status: Patch Available  (was: Open)


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-3710) Book.xml - fill out descriptions of metrics

2011-04-06 Thread Doug Meil (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Doug Meil updated HBASE-3710:
-

Attachment: (was: book.xml.patch)


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-3710) Book.xml - fill out descriptions of metrics

2011-04-06 Thread Doug Meil (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Doug Meil updated HBASE-3710:
-

Attachment: book.xml.patch


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-1512) Coprocessors: Support aggregate functions

2011-04-06 Thread Himanshu Vashishtha (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-1512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13016496#comment-13016496
 ] 

Himanshu Vashishtha commented on HBASE-1512:


Thanks for the review, Ted.

What I think the divide method should do is return Double.NaN if either operand 
is null; any operation on null should give null.

OK on the name refactoring.

I don't have a strong feeling about making a separate class out of it at this 
point, as it doesn't add much on its own, but I will do it if you say so.

 Coprocessors: Support aggregate functions
 -

 Key: HBASE-1512
 URL: https://issues.apache.org/jira/browse/HBASE-1512
 Project: HBase
  Issue Type: Sub-task
  Components: coprocessors
Reporter: stack
 Attachments: 1512.zip, AggregateCpProtocol.java, 
 AggregateProtocolImpl.java, AggregationClient.java, ColumnInterpreter.java, 
 patch-1512-2.txt, patch-1512-3.txt, patch-1512.txt


 Chatting with jgray and holstad at the kitchen table about counts, sums, and 
 other aggregation facilities (generally, cases where you want to calculate 
 some meta info on your table), it seems like it wouldn't be too hard to make a 
 filter type that could run a function server-side and return ONLY the result 
 of the aggregation or whatever.
 For example, say you just want to count rows. Currently you scan, the server 
 returns all the data to the client, and the count is done by the client 
 counting up row keys.  A bunch of time and resources is wasted returning data 
 that we're not interested in.  With this new filter type, the counting would be 
 done server-side, and it would make up a new result that was the count only 
 (kind of like MySQL: when you ask it to count, it returns a 'table' with a 
 count column whose value is the count of rows).  We could have the count just 
 done per region and return that.  Or we could maybe make a small change in the 
 scanner too so that it aggregated the per-region counts.
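 The idea can be sketched as a per-region count whose partial results the 
 client sums. The class and method names below are hypothetical illustrations, 
 not the actual coprocessor API from the attached patches.

```java
import java.util.List;

// Hypothetical sketch: each region counts its own rows server-side and
// returns only the count; the client adds up the per-region partials, so
// no row data crosses the wire.
public class RegionCountSketch {
    // Stand-in for a region's row keys; a real region would scan its stores.
    public static long countRegion(List<String> regionRowKeys) {
        return regionRowKeys.size();
    }

    public static long countTable(List<List<String>> regions) {
        long total = 0;
        for (List<String> region : regions) {
            total += countRegion(region);   // only a long is returned per region
        }
        return total;
    }
}
```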

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HBASE-3743) Throttle major compaction

2011-04-06 Thread Joep Rottinghuis (JIRA)
Throttle major compaction
-

 Key: HBASE-3743
 URL: https://issues.apache.org/jira/browse/HBASE-3743
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: Joep Rottinghuis


Add the ability to throttle major compaction.
For use cases where a stop-the-world approach is not practical, it is 
useful to be able to throttle the impact that major compaction has on the 
cluster.
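One common way to throttle a background task like this is to cap its write 
throughput by sleeping between chunks. The sketch below is a generic rate 
limiter under that assumption, not a proposed HBase API.

```java
// Hypothetical sketch: limit compaction write throughput to maxBytesPerSec
// by computing how long each chunk "should" take at the cap and sleeping
// for the remainder of that budget.
public class CompactionThrottle {
    private final long maxBytesPerSec;

    public CompactionThrottle(long maxBytesPerSec) {
        this.maxBytesPerSec = maxBytesPerSec;
    }

    // Returns the delay (ms) owed after writing chunkBytes in elapsedMs.
    public long delayMillis(long chunkBytes, long elapsedMs) {
        long targetMs = chunkBytes * 1000L / maxBytesPerSec;
        return Math.max(0, targetMs - elapsedMs);
    }
}
```

The caller would sleep for delayMillis after each chunk; a slow disk that 
already took longer than the budget incurs no extra delay.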

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-1512) Coprocessors: Support aggregate functions

2011-04-06 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-1512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13016519#comment-13016519
 ] 

Ted Yu commented on HBASE-1512:
---

I think returning Double.NaN is fine. Normally, either operand being null 
would lead to an NPE.
As for making a separate class: it would make it easier for users to produce 
other ColumnInterpreter implementations based on LongColumnInterpreter.
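The null-handling being discussed can be illustrated like this. The class is a 
simplified, hypothetical stand-in for the attached ColumnInterpreter code; the 
method names are only indicative.

```java
// Simplified stand-in for the Long-based interpreter under discussion:
// divide returns Double.NaN when either operand is null, instead of the
// NPE that plain unboxing would produce.
public class LongInterpreterSketch {
    public static Long add(Long a, Long b) {
        if (a == null) return b;
        if (b == null) return a;
        return a + b;
    }

    public static double divideForAvg(Long sum, Long count) {
        return (sum == null || count == null) ? Double.NaN : (double) sum / count;
    }
}
```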


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HBASE-3744) createTable blocks until all regions are out of transition

2011-04-06 Thread Todd Lipcon (JIRA)
createTable blocks until all regions are out of transition
--

 Key: HBASE-3744
 URL: https://issues.apache.org/jira/browse/HBASE-3744
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.90.1
Reporter: Todd Lipcon
 Fix For: 0.92.0


In HBASE-3305, the behavior of createTable was changed, introducing this bug: 
createTable now blocks until all regions have been assigned, since it uses 
BulkStartupAssigner. BulkStartupAssigner.waitUntilDone calls 
assignmentManager.waitUntilNoRegionsInTransition, which waits across all 
regions, not just the regions of the table that was just created.

We saw an issue where one table had a region which was unable to be opened, so 
it was stuck in RegionsInTransition permanently (every open was failing). Since 
this was the case, waitUntilDone would always block indefinitely even though 
the newly created table had been assigned.
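The fix direction implied here is to filter the regions-in-transition set by 
table before waiting. The sketch below is a hypothetical illustration of that 
idea (with region names modeled as plain strings), not the eventual patch.

```java
import java.util.List;

// Hypothetical sketch: block only while regions of the *created* table are
// still in transition, ignoring stuck regions that belong to other tables.
public class CreateTableWaitSketch {
    // Region names are modeled as "tableName,startKey" strings for brevity.
    public static boolean shouldKeepWaiting(String tableName,
                                            List<String> regionsInTransition) {
        for (String region : regionsInTransition) {
            if (region.startsWith(tableName + ",")) {
                return true;   // a region of our table is still unassigned
            }
        }
        return false;          // other tables' stuck regions don't block us
    }
}
```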

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HBASE-3745) Add the ability to restrict major-compactible files by timestamp

2011-04-06 Thread Todd Lipcon (JIRA)
Add the ability to restrict major-compactible files by timestamp


 Key: HBASE-3745
 URL: https://issues.apache.org/jira/browse/HBASE-3745
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.92.0
Reporter: Todd Lipcon


In some applications, a common access pattern is to frequently scan tables with 
a time range predicate restricted to a fairly recent time window. For example, 
you may want to do an incremental aggregation or indexing step only on rows 
that have changed in the last hour. We do this efficiently by tracking min and 
max timestamp on an HFile level, so that old HFiles don't have to be read.

After a major compaction, however, the entire dataset will need to be read, 
which can hurt performance of this access pattern.

We should add a column family attribute that can specify a policy like: when 
major compacting, never include an HFile that contains data with a timestamp in 
the last 4 hours. Thus, recently flushed HFiles will always remain uncompacted 
and provide the good scan performance required for these applications.
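The proposed policy amounts to a filter over candidate HFiles during 
major-compaction selection. This sketch uses made-up names and represents each 
file by its maximum data timestamp; it is not the real selection code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: drop candidate files whose newest data falls inside
// the exclusion window (e.g. the last 4 hours), so fresh HFiles stay
// uncompacted and remain cheap to scan with a recent time-range predicate.
public class CompactionSelectionSketch {
    public static List<Long> selectForMajorCompaction(List<Long> fileMaxTimestamps,
                                                      long nowMs,
                                                      long exclusionWindowMs) {
        long cutoff = nowMs - exclusionWindowMs;
        List<Long> selected = new ArrayList<Long>();
        for (long maxTs : fileMaxTimestamps) {
            if (maxTs < cutoff) {
                selected.add(maxTs);   // only files with strictly older data
            }
        }
        return selected;
    }
}
```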

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-3744) createTable blocks until all regions are out of transition

2011-04-06 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-3744:
--

Attachment: 3744.txt

First attempt at this issue.
I don't see a sync parameter for createTable() in HMasterInterface.
TestAdmin passes.

 createTable blocks until all regions are out of transition
 --

 Key: HBASE-3744
 URL: https://issues.apache.org/jira/browse/HBASE-3744
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.90.1
Reporter: Todd Lipcon
Assignee: Ted Yu
 Fix For: 0.92.0

 Attachments: 3744.txt



--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Work started] (HBASE-3744) createTable blocks until all regions are out of transition

2011-04-06 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HBASE-3744 started by Ted Yu.

 createTable blocks until all regions are out of transition
 --

 Key: HBASE-3744
 URL: https://issues.apache.org/jira/browse/HBASE-3744
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.90.1
Reporter: Todd Lipcon
Assignee: Ted Yu
 Fix For: 0.92.0

 Attachments: 3744.txt



--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-3587) Eliminate use of ReadWriteLock in RegionObserver coprocessor invocation

2011-04-06 Thread Gary Helmling (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13016566#comment-13016566
 ] 

Gary Helmling commented on HBASE-3587:
--

Posted patch for review at: https://review.cloudera.org/r/1681/

 Eliminate use of ReadWriteLock in RegionObserver coprocessor invocation
 ---

 Key: HBASE-3587
 URL: https://issues.apache.org/jira/browse/HBASE-3587
 Project: HBase
  Issue Type: Improvement
  Components: coprocessors
Reporter: Gary Helmling
Assignee: Gary Helmling

 Follow-up to a discussion on the dev list: 
 http://search-hadoop.com/m/jOovV1uAJBP
 The CoprocessorHost ReentrantReadWriteLock is imposing some overhead on data 
 read/write operations, even when no coprocessors are loaded.  Currently 
 execution of RegionCoprocessorHost pre/postXXX() methods are guarded by 
 acquiring the coprocessor read lock.  This is used to prevent coprocessor 
 registration from modifying the coprocessor collection while upcall hooks are 
 in progress.
 On further discussion, and looking at the locking in HRegion, it should be 
 sufficient to just use a CopyOnWriteArrayList for the coprocessor collection. 
  We can then remove the coprocessor lock and eliminate the associated 
 overhead without having to special case the no loaded coprocessors 
 condition.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HBASE-3746) Clean up CompressionTest to not directly reference DistributedFileSystem

2011-04-06 Thread Todd Lipcon (JIRA)
Clean up CompressionTest to not directly reference DistributedFileSystem


 Key: HBASE-3746
 URL: https://issues.apache.org/jira/browse/HBASE-3746
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.90.1
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Fix For: 0.90.3


Right now, CompressionTest has a number of issues:
- it always writes to the home directory of the user, regardless of the path 
provided
- it requires actually writing to HDFS when a local file is probably sufficient

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-3744) createTable blocks until all regions are out of transition

2011-04-06 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13016577#comment-13016577
 ] 

Ted Yu commented on HBASE-3744:
---

From IRC:
tlipcon_: it seems wrong that we use something called BulkStartupAssigner in the 
first place
[2:54pm] tlipcon_: since this is not startup
...
tlipcon_: i think BulkStartupAssigner should be renamed
[3:11pm] tlipcon_: and then either a parameter or a different subclass that 
doesn't wait

 createTable blocks until all regions are out of transition
 --

 Key: HBASE-3744
 URL: https://issues.apache.org/jira/browse/HBASE-3744
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.90.1
Reporter: Todd Lipcon
Assignee: Ted Yu
 Fix For: 0.92.0

 Attachments: 3744.txt


 In HBASE-3305, the behavior of createTable was changed and introduced this 
 bug: createTable now blocks until all regions have been assigned, since it 
 uses BulkStartupAssigner. BulkStartupAssigner.waitUntilDone calls 
 assignmentManager.waitUntilNoRegionsInTransition, which waits across all 
 regions, not just the regions of the table that has just been created.
 We saw an issue where one table had a region which was unable to be opened, 
 so it was stuck in RegionsInTransition permanently (every open was failing). 
 Since this was the case, waitUntilDone would always block indefinitely even 
 though the newly created table had been assigned.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-3744) createTable blocks until all regions are out of transition

2011-04-06 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13016583#comment-13016583
 ] 

Ted Yu commented on HBASE-3744:
---

Currently we have:
{code}
  static class BulkStartupAssigner extends BulkAssigner {
{code}
We also have:
{code}
  class BulkDisabler extends BulkAssigner {
{code}
If we rename BulkAssigner as AbstractBulkAssigner, we can rename 
BulkStartupAssigner as BulkAssigner.

 createTable blocks until all regions are out of transition
 --

 Key: HBASE-3744
 URL: https://issues.apache.org/jira/browse/HBASE-3744
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.90.1
Reporter: Todd Lipcon
Assignee: Ted Yu
 Fix For: 0.92.0

 Attachments: 3744.txt


 In HBASE-3305, the behavior of createTable was changed and introduced this 
 bug: createTable now blocks until all regions have been assigned, since it 
 uses BulkStartupAssigner. BulkStartupAssigner.waitUntilDone calls 
 assignmentManager.waitUntilNoRegionsInTransition, which waits across all 
 regions, not just the regions of the table that has just been created.
 We saw an issue where one table had a region which was unable to be opened, 
 so it was stuck in RegionsInTransition permanently (every open was failing). 
 Since this was the case, waitUntilDone would always block indefinitely even 
 though the newly created table had been assigned.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-1512) Coprocessors: Support aggregate functions

2011-04-06 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-1512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13016585#comment-13016585
 ] 

Ted Yu commented on HBASE-1512:
---

Version 4 is awesome.

 Coprocessors: Support aggregate functions
 -

 Key: HBASE-1512
 URL: https://issues.apache.org/jira/browse/HBASE-1512
 Project: HBase
  Issue Type: Sub-task
  Components: coprocessors
Reporter: stack
 Attachments: 1512.zip, AggregateCpProtocol.java, 
 AggregateProtocolImpl.java, AggregationClient.java, ColumnInterpreter.java, 
 patch-1512-2.txt, patch-1512-3.txt, patch-1512-4.txt, patch-1512.txt


 Chatting with jgray and holstad at the kitchen table about counts, sums, and 
 other aggregating facilities (generally, cases where you want to calculate 
 some meta info on your table), it seems like it wouldn't be too hard to make a 
 filter type that could run a function server-side and return ONLY the result 
 of the aggregation.
 For example, say you just want to count rows, currently you scan, server 
 returns all data to client and count is done by client counting up row keys.  
 A bunch of time and resources have been wasted returning data that we're not 
 interested in.  With this new filter type, the counting would be done 
 server-side and then it would make up a new result that was the count only 
 (kinda like mysql when you ask it to count, it returns a 'table' with a count 
 column whose value is count of rows).   We could have it so the count was 
 just done per region and return that.  Or we could maybe make a small change 
 in scanner too so that it aggregated the per-region counts.  
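The idea above — each region counts its own rows server-side and the client merges per-region partials — can be modeled without any HBase dependencies (class and method names below are illustrative stand-ins, not the AggregationClient API):

```java
import java.util.Arrays;
import java.util.List;

// Illustrative-only model of server-side aggregation: each "region" returns
// a single partial count, and the client sums partials, so only one long per
// region crosses the network instead of every row.
class AggregationSketch {
  // stand-in for a coprocessor endpoint executing on one region
  static long countRegion(List<String> regionRows) {
    return regionRows.size();
  }

  // stand-in for the client-side merge across all regions of a table
  static long countTable(List<List<String>> regions) {
    long total = 0;
    for (List<String> region : regions) {
      total += countRegion(region); // only the partial count is transferred
    }
    return total;
  }
}
```

This is exactly the mysql-style behavior described: the "result" returned is the count, not the data being counted.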

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HBASE-3747) [replication] ReplicationSource should differanciate remote and local exceptions

2011-04-06 Thread Jean-Daniel Cryans (JIRA)
[replication] ReplicationSource should differanciate remote and local exceptions


 Key: HBASE-3747
 URL: https://issues.apache.org/jira/browse/HBASE-3747
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.90.1
Reporter: Jean-Daniel Cryans
Assignee: Jean-Daniel Cryans
Priority: Minor
 Fix For: 0.90.3


From Jeff Whiting on the list:

I'm not sure... the key to everything was picking up on 
RemoteException in "Unable to replicate because 
org.apache.hadoop.ipc.RemoteException" and realizing that it was on the 
replication cluster.

If it said something like "Unable to replicate. Destination cluster threw an 
exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: 
java.lang.RuntimeException:"

Or something that makes it clear it is an exception on the remote or 
destination cluster would be helpful.  It is easy to scan over 
"org.apache.hadoop.ipc.RemoteException" and read it like 
"org.ap...some-kind-of...exception".
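The message change Jeff asks for can be sketched with local stand-in types (the nested RemoteException below stands in for org.apache.hadoop.ipc.RemoteException; this is not the actual ReplicationSource code):

```java
// Hedged sketch: branch on whether the failure was a RemoteException (thrown
// on the destination cluster and re-materialized locally) and say so in the
// log line, instead of logging a message that reads like a local failure.
class ReplicationLogSketch {
  // stand-in for org.apache.hadoop.ipc.RemoteException
  static class RemoteException extends Exception {
    RemoteException(String msg) { super(msg); }
  }

  static String describeFailure(Exception e) {
    if (e instanceof RemoteException) {
      // make the origin explicit, as the list post suggests
      return "Unable to replicate. Destination cluster threw an exception: "
          + e.getMessage();
    }
    return "Unable to replicate because " + e.getMessage();
  }
}
```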

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HBASE-3748) Add rolling of thrift/rest daemons to graceful_stop.sh script

2011-04-06 Thread stack (JIRA)
Add rolling of thrift/rest daemons to graceful_stop.sh script
-

 Key: HBASE-3748
 URL: https://issues.apache.org/jira/browse/HBASE-3748
 Project: HBase
  Issue Type: Task
Affects Versions: 0.92.0
Reporter: stack
Assignee: stack


Add option to stop/start thrift and rest servers as part of a rolling a server.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (HBASE-3748) Add rolling of thrift/rest daemons to graceful_stop.sh script

2011-04-06 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack resolved HBASE-3748.
--

   Resolution: Fixed
Fix Version/s: 0.92.0

Applied to trunk and branch... and added some more to the decommission section 
in the book.

 Add rolling of thrift/rest daemons to graceful_stop.sh script
 -

 Key: HBASE-3748
 URL: https://issues.apache.org/jira/browse/HBASE-3748
 Project: HBase
  Issue Type: Task
Affects Versions: 0.92.0
Reporter: stack
Assignee: stack
 Fix For: 0.92.0


 Add option to stop/start thrift and rest servers as part of a rolling a 
 server.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-3746) Clean up CompressionTest to not directly reference DistributedFileSystem

2011-04-06 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13016605#comment-13016605
 ] 

stack commented on HBASE-3746:
--

+1  Nice.

 Clean up CompressionTest to not directly reference DistributedFileSystem
 

 Key: HBASE-3746
 URL: https://issues.apache.org/jira/browse/HBASE-3746
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.90.1
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Fix For: 0.90.3

 Attachments: hbase-3746.txt


 Right now, CompressionTest has a number of issues:
 - it always writes to the home directory of the user, regardless of the path 
 provided
 - it requires actually writing to HDFS when a local file is probably 
 sufficient

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-3587) Eliminate use of ReadWriteLock in RegionObserver coprocessor invocation

2011-04-06 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13016607#comment-13016607
 ] 

stack commented on HBASE-3587:
--

+1

Nice one Gary.

 Eliminate use of ReadWriteLock in RegionObserver coprocessor invocation
 ---

 Key: HBASE-3587
 URL: https://issues.apache.org/jira/browse/HBASE-3587
 Project: HBase
  Issue Type: Improvement
  Components: coprocessors
Reporter: Gary Helmling
Assignee: Gary Helmling

 Follow-up to a discussion on the dev list: 
 http://search-hadoop.com/m/jOovV1uAJBP
 The CoprocessorHost ReentrantReadWriteLock is imposing some overhead on data 
 read/write operations, even when no coprocessors are loaded.  Currently 
 execution of RegionCoprocessorHost pre/postXXX() methods are guarded by 
 acquiring the coprocessor read lock.  This is used to prevent coprocessor 
 registration from modifying the coprocessor collection while upcall hooks are 
 in progress.
 On further discussion, and looking at the locking in HRegion, it should be 
 sufficient to just use a CopyOnWriteArrayList for the coprocessor collection. 
  We can then remove the coprocessor lock and eliminate the associated 
 overhead without having to special case the no loaded coprocessors 
 condition.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-3747) [replication] ReplicationSource should differanciate remote and local exceptions

2011-04-06 Thread Jean-Daniel Cryans (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jean-Daniel Cryans updated HBASE-3747:
--

Attachment: HBASE-3747.patch

Patch that adds better logging of the issue and some refactoring of 
RemoteException handling.

 [replication] ReplicationSource should differanciate remote and local 
 exceptions
 

 Key: HBASE-3747
 URL: https://issues.apache.org/jira/browse/HBASE-3747
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.90.1
Reporter: Jean-Daniel Cryans
Assignee: Jean-Daniel Cryans
Priority: Minor
 Fix For: 0.90.3

 Attachments: HBASE-3747.patch


 From Jeff Whiting on the list:
 I'm not sure... the key to everything was picking up on 
 RemoteException in "Unable to replicate because 
 org.apache.hadoop.ipc.RemoteException" and realizing that it was on the 
 replication cluster.
 If it said something like "Unable to replicate. Destination cluster threw an 
 exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: 
 java.lang.RuntimeException:"
 Or something that makes it clear it is an exception on the remote or 
 destination cluster would be helpful.  It is easy to scan over 
 "org.apache.hadoop.ipc.RemoteException" and read it like 
 "org.ap...some-kind-of...exception".

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-3734) HBaseAdmin creates new configurations in getCatalogTracker

2011-04-06 Thread Jean-Daniel Cryans (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jean-Daniel Cryans updated HBASE-3734:
--

Attachment: HBASE-3734.patch

Patch that adds the copy in HBA's constructor and removes the copy in 
getCatalogTracker; passes unit tests.

 HBaseAdmin creates new configurations in getCatalogTracker
 --

 Key: HBASE-3734
 URL: https://issues.apache.org/jira/browse/HBASE-3734
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.1
Reporter: Jean-Daniel Cryans
 Fix For: 0.90.3

 Attachments: HBASE-3734.patch


 HBaseAdmin.getCatalogTracker creates a new Configuration every time it's 
 called; instead, HBA should reuse the same one and do the copy inside the 
 constructor.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-3710) Book.xml - fill out descriptions of metrics

2011-04-06 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-3710:
-

   Resolution: Fixed
Fix Version/s: 0.92.0
 Hadoop Flags: [Reviewed]
   Status: Resolved  (was: Patch Available)

Committed to TRUNK.  Thanks for patch Doug.  The TODOs are fine.

 Book.xml - fill out descriptions of metrics
 ---

 Key: HBASE-3710
 URL: https://issues.apache.org/jira/browse/HBASE-3710
 Project: HBase
  Issue Type: Improvement
Reporter: Doug Meil
Assignee: Doug Meil
Priority: Minor
 Fix For: 0.92.0

 Attachments: book.xml.patch


 I filled out the skeleton of the metrics in book.xml, but I'd like one of the 
 committers to fill in the rest of the details on what these mean.  Thanks! :-)
 For example..
 ---
 I'm assuming that these are referring to the LRU block cache in memory.  I'd 
 like the docs to state this (for the sake of clarity) and also the units 
 (e.g., MB).
 hbase.regionserver.blockCacheCount
 hbase.regionserver.blockCacheFree
 hbase.regionserver.blockCacheHitRatio
 hbase.regionserver.blockCacheSize
 ---
 This is read latency from HDFS, I assume...
 hbase.regionserver.fsReadLatency_avg_time
 hbase.regionserver.fsReadLatency_num_ops
 hbase.regionserver.fsSyncLatency_avg_time
 hbase.regionserver.fsSyncLatency_num_ops
 hbase.regionserver.fsWriteLatency_avg_time
 hbase.regionserver.fsWriteLatency_num_ops
 
 Point-in-time utilization (i.e., as opposed to max or trailing) of the 
 memstore, I assume; would be nice to document.
 hbase.regionserver.memstoreSizeMB
 ---
 obvious, but might as well document it
 hbase.regionserver.regions
 --
 This is any Put, Get, Delete, or Scan operation, I assume?
 hbase.regionserver.requests
 --
 detail on these would be nice, especially for tips on if there are any 
 critical numbers/ratios to watch for.
 hbase.regionserver.storeFileIndexSizeMB
 hbase.regionserver.stores

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-3738) Book.xml - expanding Architecture Client section

2011-04-06 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-3738:
-

   Resolution: Fixed
Fix Version/s: 0.92.0
 Hadoop Flags: [Reviewed]
   Status: Resolved  (was: Patch Available)

Committed to TRUNK.  Thanks for the patch Doug.

In future, try exorcising tabs, do ~80 characters a line, and name your patch 
for the issue number instead of calling it book.xml.patch each time.  Thanks 
Doug.  Our book is coming along nicely.

 Book.xml - expanding Architecture Client section
 

 Key: HBASE-3738
 URL: https://issues.apache.org/jira/browse/HBASE-3738
 Project: HBase
  Issue Type: Improvement
Reporter: Doug Meil
Assignee: Doug Meil
Priority: Minor
 Fix For: 0.92.0

 Attachments: book.xml.patch


 Expanded the Architecture Client section.  Broke 'connection' into 
 sub-section, and created 'writebuffer and batch methods' into another 
 sub-section.
 Both seem to be fairly frequent questions on the dist-list.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-3746) Clean up CompressionTest to not directly reference DistributedFileSystem

2011-04-06 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HBASE-3746:
---

Attachment: hbase-3746.txt

Slightly updated patch to make it exit after calling usage(), and to show the 
proper example as file:///tmp instead of file://tmp.

Committing to trunk and branch.

 Clean up CompressionTest to not directly reference DistributedFileSystem
 

 Key: HBASE-3746
 URL: https://issues.apache.org/jira/browse/HBASE-3746
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.90.1
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Fix For: 0.90.3

 Attachments: hbase-3746.txt, hbase-3746.txt


 Right now, CompressionTest has a number of issues:
 - it always writes to the home directory of the user, regardless of the path 
 provided
 - it requires actually writing to HDFS when a local file is probably 
 sufficient

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-3738) Book.xml - expanding Architecture Client section

2011-04-06 Thread Doug Meil (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13016620#comment-13016620
 ] 

Doug Meil commented on HBASE-3738:
--

Roger that!  Sorry about the tabs.  For some reason I thought I was helping 
by using 'book.xml.patch' every time.  It makes sense to name it for the edit.

 Book.xml - expanding Architecture Client section
 

 Key: HBASE-3738
 URL: https://issues.apache.org/jira/browse/HBASE-3738
 Project: HBase
  Issue Type: Improvement
Reporter: Doug Meil
Assignee: Doug Meil
Priority: Minor
 Fix For: 0.92.0

 Attachments: book.xml.patch


 Expanded the Architecture Client section.  Broke 'connection' into 
 sub-section, and created 'writebuffer and batch methods' into another 
 sub-section.
 Both seem to be fairly frequent questions on the dist-list.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-3746) Clean up CompressionTest to not directly reference DistributedFileSystem

2011-04-06 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HBASE-3746:
---

  Resolution: Fixed
Hadoop Flags: [Reviewed]
  Status: Resolved  (was: Patch Available)

 Clean up CompressionTest to not directly reference DistributedFileSystem
 

 Key: HBASE-3746
 URL: https://issues.apache.org/jira/browse/HBASE-3746
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.90.1
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Fix For: 0.90.3

 Attachments: hbase-3746.txt, hbase-3746.txt


 Right now, CompressionTest has a number of issues:
 - it always writes to the home directory of the user, regardless of the path 
 provided
 - it requires actually writing to HDFS when a local file is probably 
 sufficient

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-3587) Eliminate use of ReadWriteLock in RegionObserver coprocessor invocation

2011-04-06 Thread Gary Helmling (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Helmling updated HBASE-3587:
-

Attachment: HBASE-3587.patch

Patch committed to trunk

 Eliminate use of ReadWriteLock in RegionObserver coprocessor invocation
 ---

 Key: HBASE-3587
 URL: https://issues.apache.org/jira/browse/HBASE-3587
 Project: HBase
  Issue Type: Improvement
  Components: coprocessors
Reporter: Gary Helmling
Assignee: Gary Helmling
 Fix For: 0.92.0

 Attachments: HBASE-3587.patch


 Follow-up to a discussion on the dev list: 
 http://search-hadoop.com/m/jOovV1uAJBP
 The CoprocessorHost ReentrantReadWriteLock is imposing some overhead on data 
 read/write operations, even when no coprocessors are loaded.  Currently 
 execution of RegionCoprocessorHost pre/postXXX() methods are guarded by 
 acquiring the coprocessor read lock.  This is used to prevent coprocessor 
 registration from modifying the coprocessor collection while upcall hooks are 
 in progress.
 On further discussion, and looking at the locking in HRegion, it should be 
 sufficient to just use a CopyOnWriteArrayList for the coprocessor collection. 
  We can then remove the coprocessor lock and eliminate the associated 
 overhead without having to special case the no loaded coprocessors 
 condition.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (HBASE-3587) Eliminate use of ReadWriteLock in RegionObserver coprocessor invocation

2011-04-06 Thread Gary Helmling (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Helmling resolved HBASE-3587.
--

   Resolution: Fixed
Fix Version/s: 0.92.0

Committed to trunk

 Eliminate use of ReadWriteLock in RegionObserver coprocessor invocation
 ---

 Key: HBASE-3587
 URL: https://issues.apache.org/jira/browse/HBASE-3587
 Project: HBase
  Issue Type: Improvement
  Components: coprocessors
Reporter: Gary Helmling
Assignee: Gary Helmling
 Fix For: 0.92.0

 Attachments: HBASE-3587.patch


 Follow-up to a discussion on the dev list: 
 http://search-hadoop.com/m/jOovV1uAJBP
 The CoprocessorHost ReentrantReadWriteLock is imposing some overhead on data 
 read/write operations, even when no coprocessors are loaded.  Currently 
 execution of RegionCoprocessorHost pre/postXXX() methods are guarded by 
 acquiring the coprocessor read lock.  This is used to prevent coprocessor 
 registration from modifying the coprocessor collection while upcall hooks are 
 in progress.
 On further discussion, and looking at the locking in HRegion, it should be 
 sufficient to just use a CopyOnWriteArrayList for the coprocessor collection. 
  We can then remove the coprocessor lock and eliminate the associated 
 overhead without having to special case the no loaded coprocessors 
 condition.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (HBASE-3747) [replication] ReplicationSource should differanciate remote and local exceptions

2011-04-06 Thread Jean-Daniel Cryans (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jean-Daniel Cryans resolved HBASE-3747.
---

Resolution: Fixed

Committed to branch and trunk.

 [replication] ReplicationSource should differanciate remote and local 
 exceptions
 

 Key: HBASE-3747
 URL: https://issues.apache.org/jira/browse/HBASE-3747
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.90.1
Reporter: Jean-Daniel Cryans
Assignee: Jean-Daniel Cryans
Priority: Minor
 Fix For: 0.90.3

 Attachments: HBASE-3747.patch


 From Jeff Whiting on the list:
 I'm not sure... the key to everything was picking up on 
 RemoteException in "Unable to replicate because 
 org.apache.hadoop.ipc.RemoteException" and realizing that it was on the 
 replication cluster.
 If it said something like "Unable to replicate. Destination cluster threw an 
 exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: 
 java.lang.RuntimeException:"
 Or something that makes it clear it is an exception on the remote or 
 destination cluster would be helpful.  It is easy to scan over 
 "org.apache.hadoop.ipc.RemoteException" and read it like 
 "org.ap...some-kind-of...exception".

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Assigned] (HBASE-3734) HBaseAdmin creates new configurations in getCatalogTracker

2011-04-06 Thread Jean-Daniel Cryans (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jean-Daniel Cryans reassigned HBASE-3734:
-

Assignee: Jean-Daniel Cryans

 HBaseAdmin creates new configurations in getCatalogTracker
 --

 Key: HBASE-3734
 URL: https://issues.apache.org/jira/browse/HBASE-3734
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.1
Reporter: Jean-Daniel Cryans
Assignee: Jean-Daniel Cryans
 Fix For: 0.90.3

 Attachments: HBASE-3734.patch


 HBaseAdmin.getCatalogTracker creates a new Configuration every time it's 
 called; instead, HBA should reuse the same one and do the copy inside the 
 constructor.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HBASE-3749) Master can't exit when open port failed

2011-04-06 Thread gaojinchao (JIRA)
Master can't exit when open port failed
---

 Key: HBASE-3749
 URL: https://issues.apache.org/jira/browse/HBASE-3749
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.90.1
Reporter: gaojinchao


When the HMaster crashed and restarted, the HMaster hung up.

// start up all service threads.
startServiceThreads();   // <---- opening the port failed here!

// Wait for region servers to report in.  Returns count of regions.
int regionCount = this.serverManager.waitForRegionServers();

// TODO: Should do this in background rather than block master startup
this.fileSystemManager.
  splitLogAfterStartup(this.serverManager.getOnlineServers());

// Make sure root and meta assigned before proceeding.
assignRootAndMeta();   // <---- hung up in this function, because root can't 
be assigned.

  if (!catalogTracker.verifyRootRegionLocation(timeout)) {
    this.assignmentManager.assignRoot();
    this.catalogTracker.waitForRoot();   // <---- this statement hangs
    assigned++;
  }
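The hang comes from waiting on root assignment with no check of the master's stopped state. A defensive version of the wait (class and field names below are hypothetical, not the real CatalogTracker API) would poll both conditions:

```java
// Hedged sketch of the fix direction: a wait that also observes the stop
// flag and a deadline, so an aborted master exits instead of hanging forever.
class RootWaitSketch {
  volatile boolean stopped = false;       // set when the master aborts startup
  volatile boolean rootAssigned = false;  // set when -ROOT- gets a location

  boolean waitForRoot(long timeoutMs) {
    long deadline = System.currentTimeMillis() + timeoutMs;
    while (!rootAssigned && !stopped && System.currentTimeMillis() < deadline) {
      try {
        Thread.sleep(5); // a real implementation would wait on a ZK event
      } catch (InterruptedException ie) {
        Thread.currentThread().interrupt();
        return false;
      }
    }
    return rootAssigned; // false means the caller must exit, not keep waiting
  }
}
```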

The log is as follows:

2011-04-07 16:38:22,850 INFO org.mortbay.log: Logging to 
org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
2011-04-07 16:38:22,908 INFO org.apache.hadoop.http.HttpServer: Port returned 
by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the 
listener on 60010
2011-04-07 16:38:22,909 FATAL org.apache.hadoop.hbase.master.HMaster: Failed 
startup
java.net.BindException: Address already in use
 at sun.nio.ch.Net.bind(Native Method)
 at 
sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119)
 at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)
 at 
org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
 at org.apache.hadoop.http.HttpServer.start(HttpServer.java:445)
 at 
org.apache.hadoop.hbase.master.HMaster.startServiceThreads(HMaster.java:542)
 at 
org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:373)
 at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:278)
2011-04-07 16:38:22,910 INFO org.apache.hadoop.hbase.master.HMaster: Aborting
2011-04-07 16:38:22,911 INFO org.apache.hadoop.hbase.master.ServerManager: 
Exiting wait on regionserver(s) to checkin; count=0, stopped=true, count of 
regions out on cluster=0
2011-04-07 16:38:22,914 DEBUG org.apache.hadoop.hbase.master.MasterFileSystem: 
No log files to split, proceeding...
2011-04-07 16:38:22,930 INFO org.apache.hadoop.ipc.HbaseRPC: Server at 
167-6-1-12/167.6.1.12:60020 could not be reached after 1 tries, giving up.
2011-04-07 16:38:22,930 INFO 
org.apache.hadoop.hbase.catalog.RootLocationEditor: Unsetting ROOT region 
location in ZooKeeper
2011-04-07 16:38:22,941 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
master:6-0x22f2c49d2590021 Creating (or updating) unassigned node for 
70236052 with OFFLINE state
2011-04-07 16:38:22,956 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: 
Server stopped; skipping assign of -ROOT-,,0.70236052 state=OFFLINE, 
ts=1302165502941
2011-04-07 16:38:32,746 INFO 
org.apache.hadoop.hbase.master.AssignmentManager$TimeoutMonitor: 
167-6-1-11:6.timeoutMonitor exiting
2011-04-07 16:39:22,770 INFO org.apache.hadoop.hbase.master.LogCleaner: 
master-167-6-1-11:6.oldLogCleaner exiting  


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HBASE-3750) HTablePool.putTable() should call table.flushCommits()

2011-04-06 Thread Ted Yu (JIRA)
HTablePool.putTable() should call table.flushCommits()
--

 Key: HBASE-3750
 URL: https://issues.apache.org/jira/browse/HBASE-3750
 Project: HBase
  Issue Type: Bug
  Components: client
Affects Versions: 0.90.1
Reporter: Ted Yu
Assignee: Ted Yu


Currently HTablePool.putTable() doesn't call table.flushCommits().
This may turn out to be a surprise for users.
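The surprise can be modeled with a minimal stand-in for a pooled table (illustrative names, not the HTablePool API): a client-side write buffer is only sent on flushCommits(), so returning the table to the pool without flushing silently strands buffered puts.

```java
import java.util.ArrayList;
import java.util.List;

// Minimal model of the bug: puts go to a client-side buffer and only reach
// the server on flushCommits(). The proposed fix is to flush when the table
// is handed back to the pool.
class PooledTableSketch {
  final List<String> writeBuffer = new ArrayList<>(); // pending puts
  final List<String> committed = new ArrayList<>();   // stand-in for the server

  void put(String row) { writeBuffer.add(row); }      // buffered, not sent

  void flushCommits() {
    committed.addAll(writeBuffer);
    writeBuffer.clear();
  }

  // the proposed fix: flush before the table goes back into the pool
  void putTableBackInPool() { flushCommits(); }
}
```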

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-3750) HTablePool.putTable() should call table.flushCommits()

2011-04-06 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-3750:
--

Attachment: 3750.txt

 HTablePool.putTable() should call table.flushCommits()
 --

 Key: HBASE-3750
 URL: https://issues.apache.org/jira/browse/HBASE-3750
 Project: HBase
  Issue Type: Bug
  Components: client
Affects Versions: 0.90.1
Reporter: Ted Yu
Assignee: Ted Yu
 Attachments: 3750.txt


 Currently HTablePool.putTable() doesn't call table.flushCommits().
 This may turn out to be a surprise for users.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (HBASE-3723) Major compact should be done when there is only one storefile and some keyvalue is outdated.

2011-04-06 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack resolved HBASE-3723.
--

  Resolution: Fixed
Hadoop Flags: [Reviewed]

Committed to TRUNK.  Thank you for the patch Zhou.

 Major compact should be done when there is only one storefile and some 
 keyvalue is outdated.
 

 Key: HBASE-3723
 URL: https://issues.apache.org/jira/browse/HBASE-3723
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 0.90.0, 0.90.1
Reporter: zhoushuaifeng
 Fix For: 0.90.2

 Attachments: hbase-3723.txt


 In the function store.isMajorCompaction:
   if (filesToCompact.size() == 1) {
     // Single file
     StoreFile sf = filesToCompact.get(0);
     long oldest =
         (sf.getReader().timeRangeTracker == null) ?
             Long.MIN_VALUE :
             now - sf.getReader().timeRangeTracker.minimumTimestamp;
     if (sf.isMajorCompaction() &&
         (this.ttl == HConstants.FOREVER || oldest < this.ttl)) {
       if (LOG.isDebugEnabled()) {
         LOG.debug("Skipping major compaction of " + this.storeNameStr +
             " because one (major) compacted file only and oldestTime " +
             oldest + "ms is < ttl=" + this.ttl);
       }
     }
   } else {
 When there is only one storefile in the store and some KeyValues' TTL has 
 expired, the major compaction checker should send this region to the compaction 
 queue and run a major compaction to clean out the outdated data. But according 
 to the code in 0.90.1, it does nothing.
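The intent of the report can be sketched as a small decision rule. This is an illustrative stand-in, not HBase's actual API: the names `shouldMajorCompact`, `oldestAgeMs`, and `ttlMs` are invented here, and only the logic mirrors the argument above, that even a file that already came out of a major compaction should be rewritten once its oldest data outlives the store's TTL.

```java
// Illustrative sketch of the decision rule HBASE-3723 argues for; names are
// stand-ins, not HBase's API.
public class SingleFileCompactionCheck {
    static final long FOREVER = Long.MAX_VALUE; // stand-in for HConstants.FOREVER

    /**
     * @param alreadyMajorCompacted the single storefile came out of a major compaction
     * @param oldestAgeMs age of the oldest KeyValue in the file, in ms
     * @param ttlMs       the store's configured TTL, in ms
     */
    static boolean shouldMajorCompact(boolean alreadyMajorCompacted,
                                      long oldestAgeMs, long ttlMs) {
        if (ttlMs == FOREVER) {
            return false;              // nothing ever expires, nothing to clean
        }
        if (!alreadyMajorCompacted) {
            return true;               // ordinary single-file rewrite
        }
        // The point of the report: a major-compacted file still needs another
        // rewrite once its oldest data passes the TTL, to drop expired cells.
        return oldestAgeMs >= ttlMs;
    }

    public static void main(String[] args) {
        System.out.println(shouldMajorCompact(true, 1_000, 10_000));  // data young: skip
        System.out.println(shouldMajorCompact(true, 20_000, 10_000)); // expired: compact
    }
}
```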

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-1364) [performance] Distributed splitting of regionserver commit logs

2011-04-06 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-1364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13016681#comment-13016681
 ] 

Ted Yu commented on HBASE-1364:
---

I got the following from TestDistributedLogSplitting:
{code}
testOrphanLogCreation(org.apache.hadoop.hbase.master.TestDistributedLogSplitting)
  Time elapsed: 33.203 sec  <<< ERROR!
java.lang.Exception: Unexpected exception, 
expected<org.apache.hadoop.hbase.regionserver.wal.OrphanHLogAfterSplitException> 
but was<java.lang.Error>
at 
org.junit.internal.runners.statements.ExpectException.evaluate(ExpectException.java:28)
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:76)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:193)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:52)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:191)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:42)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:184)
at org.junit.runners.ParentRunner.run(ParentRunner.java:236)
at 
org.apache.maven.surefire.junit4.JUnit4TestSet.execute(JUnit4TestSet.java:62)
at 
org.apache.maven.surefire.suite.AbstractDirectoryTestSuite.executeTestSet(AbstractDirectoryTestSuite.java:140)
at 
org.apache.maven.surefire.suite.AbstractDirectoryTestSuite.execute(AbstractDirectoryTestSuite.java:165)
at org.apache.maven.surefire.Surefire.run(Surefire.java:107)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at 
org.apache.maven.surefire.booter.SurefireBooter.runSuitesInProcess(SurefireBooter.java:289)
at 
org.apache.maven.surefire.booter.SurefireBooter.main(SurefireBooter.java:1005)
Caused by: java.lang.Error: Unresolved compilation problem:
The method appendNoSync(HRegionInfo, byte[], WALEdit, long) is 
undefined for the type HLog
{code}


 [performance] Distributed splitting of regionserver commit logs
 ---

 Key: HBASE-1364
 URL: https://issues.apache.org/jira/browse/HBASE-1364
 Project: HBase
  Issue Type: Improvement
  Components: coprocessors
Reporter: stack
Assignee: Prakash Khemani
Priority: Critical
 Fix For: 0.92.0

 Attachments: HBASE-1364.patch

  Time Spent: 8h
  Remaining Estimate: 0h

 HBASE-1008 has some improvements to our log splitting on regionserver crash; 
 but it needs to run even faster.
 (Below is from HBASE-1008)
 In bigtable paper, the split is distributed. If we're going to have 1000 
 logs, we need to distribute or at least multithread the splitting.
 1. As is, regions starting up expect to find one reconstruction log only. 
 Need to make it so they pick up a bunch of edit logs, and it should be fine 
 that logs are elsewhere in hdfs in an output directory written by all split 
 participants, whether multithreaded or a mapreduce-like distributed process 
 (let's write our distributed sort first as an MR so we learn what's involved; 
 distributed sort, as much as possible, should use MR framework pieces). On 
 startup, regions go to this directory and pick up the files written by split 
 participants, deleting and clearing the dir when all have been read in. Making 
 it so the split can take multiple logs for input can also make the split 
 process more robust than the current tenuous process, which loses all edits 
 if it doesn't make it to the end without error.
 2. Each column family rereads the reconstruction log to find its edits. Need 
 to fix that. The split can sort the edits by column family so a store only 
 reads its own edits.
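The per-family sort proposed in item 2 can be sketched roughly like this. The `Edit` type and its fields are stand-ins, not HBase classes; the sketch only shows the grouping idea, so each store replays its own slice of the recovered edits instead of rereading the whole log.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch of item 2: group recovered WAL edits by column family
// so each store replays only its own edits. Edit is a stand-in type.
public class EditsByFamily {
    static final class Edit {
        final String family;
        final String qualifier;
        Edit(String family, String qualifier) {
            this.family = family;
            this.qualifier = qualifier;
        }
    }

    static Map<String, List<Edit>> groupByFamily(List<Edit> recovered) {
        Map<String, List<Edit>> perFamily = new LinkedHashMap<>();
        for (Edit e : recovered) {
            perFamily.computeIfAbsent(e.family, f -> new ArrayList<>()).add(e);
        }
        return perFamily;
    }

    public static void main(String[] args) {
        List<Edit> recovered = List.of(
            new Edit("cf1", "a"), new Edit("cf2", "b"), new Edit("cf1", "c"));
        Map<String, List<Edit>> perFamily = groupByFamily(recovered);
        // Each store now reads just its slice instead of rereading the log.
        System.out.println(perFamily.get("cf1").size()); // 2
        System.out.println(perFamily.get("cf2").size()); // 1
    }
}
```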

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-3750) HTablePool.putTable() should call table.flushCommits()

2011-04-06 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13016685#comment-13016685
 ] 

stack commented on HBASE-3750:
--

Do you think this is the way to go, Ted?  How do you think people use the 
HTablePool?  I'd think they'd check out an HTable instance, add an edit, then 
check it back in?  If that's the case, then we'll flush each single edit.  Or 
do you think folks check out an HTable instance, keep it around a while putting 
multiple edits on it, and only then check it back in, perhaps not using it again?

I wonder if a warning up in the javadoc that we do NOT flush on return to the 
pool, so the client needs to do it themselves, would be a better way to go?

I'm not sure.

 HTablePool.putTable() should call table.flushCommits()
 --

 Key: HBASE-3750
 URL: https://issues.apache.org/jira/browse/HBASE-3750
 Project: HBase
  Issue Type: Bug
  Components: client
Affects Versions: 0.90.1
Reporter: Ted Yu
Assignee: Ted Yu
 Attachments: 3750.txt


 Currently HTablePool.putTable() doesn't call table.flushCommits()
 This may come as a surprise to users

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-3734) HBaseAdmin creates new configurations in getCatalogTracker

2011-04-06 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13016687#comment-13016687
 ] 

stack commented on HBASE-3734:
--

Is this the right way to go?  The javadoc on #getCatalogTracker says:

bq. @return A new CatalogTracker instance; call {@link 
#cleanupCatalogTracker(CatalogTracker)} to cleanup the returned catalog tracker.

Is the problem that cleanupCatalogTracker is not being called?

 HBaseAdmin creates new configurations in getCatalogTracker
 --

 Key: HBASE-3734
 URL: https://issues.apache.org/jira/browse/HBASE-3734
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.1
Reporter: Jean-Daniel Cryans
Assignee: Jean-Daniel Cryans
 Fix For: 0.90.3

 Attachments: HBASE-3734.patch


 HBaseAdmin.getCatalogTracker creates new Configuration every time it's 
 called, instead HBA should reuse the same one and do the copy inside the 
 constructor.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-3747) [replication] ReplicationSource should differanciate remote and local exceptions

2011-04-06 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13016688#comment-13016688
 ] 

stack commented on HBASE-3747:
--

+1

 [replication] ReplicationSource should differanciate remote and local 
 exceptions
 

 Key: HBASE-3747
 URL: https://issues.apache.org/jira/browse/HBASE-3747
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.90.1
Reporter: Jean-Daniel Cryans
Assignee: Jean-Daniel Cryans
Priority: Minor
 Fix For: 0.90.3

 Attachments: HBASE-3747.patch


 From Jeff Whiting on the list:
 I'm not sure... the key to everything was picking up on RemoteException in 
 "Unable to replicate because org.apache.hadoop.ipc.RemoteException" and 
 realizing that it was on the replication cluster.
 If it said something like "Unable to replicate. Destination cluster threw an 
 exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: 
 java.lang.RuntimeException:", 
 or something that makes it clear it is an exception on the remote or 
 destination cluster, that would be helpful.  It is easy to scan over 
 org.apache.hadoop.ipc.RemoteException and read it like 
 org.ap...some-kind-of...exception.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-3750) HTablePool.putTable() should call table.flushCommits()

2011-04-06 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-3750:
--

Attachment: (was: 3750.txt)

 HTablePool.putTable() should call table.flushCommits()
 --

 Key: HBASE-3750
 URL: https://issues.apache.org/jira/browse/HBASE-3750
 Project: HBase
  Issue Type: Bug
  Components: client
Affects Versions: 0.90.1
Reporter: Ted Yu
Assignee: Ted Yu
 Attachments: 3750.txt


 Currently HTablePool.putTable() doesn't call table.flushCommits()
 This may come as a surprise to users

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-3750) HTablePool.putTable() should call table.flushCommits()

2011-04-06 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-3750:
--

Attachment: 3750.txt

Added check for isAutoFlush() before calling flushCommits()
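The patched behavior under discussion might look roughly like the sketch below. These are stand-in types, not HBase's actual HTablePool or HTable code: on return to the pool, buffered edits are flushed only when autoFlush is off, since with autoFlush on every put has already been sent.

```java
// Illustrative sketch (not HBase's actual code) of putTable() with the
// isAutoFlush() guard: flush buffered edits only when autoFlush is off.
public class PoolReturnSketch {
    /** Minimal stand-in for the parts of HTable the sketch needs. */
    static class BufferedTable {
        private final boolean autoFlush;
        int buffered = 0;   // count of unflushed edits
        int flushes = 0;    // how many times flushCommits ran

        BufferedTable(boolean autoFlush) { this.autoFlush = autoFlush; }

        void put() { if (!autoFlush) buffered++; /* autoFlush sends at once */ }

        boolean isAutoFlush() { return autoFlush; }

        void flushCommits() { buffered = 0; flushes++; }
    }

    /** The putTable() behavior under discussion, with the isAutoFlush guard. */
    static void putTable(BufferedTable table) {
        if (!table.isAutoFlush()) {
            table.flushCommits();   // don't strand buffered edits in the pool
        }
        // ... then return the instance to the pool's internal queue
    }

    public static void main(String[] args) {
        BufferedTable buffering = new BufferedTable(false);
        buffering.put();
        putTable(buffering);
        System.out.println(buffering.buffered); // 0: edits were flushed

        BufferedTable eager = new BufferedTable(true);
        eager.put();
        putTable(eager);
        System.out.println(eager.flushes); // 0: no redundant flush
    }
}
```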

 HTablePool.putTable() should call table.flushCommits()
 --

 Key: HBASE-3750
 URL: https://issues.apache.org/jira/browse/HBASE-3750
 Project: HBase
  Issue Type: Bug
  Components: client
Affects Versions: 0.90.1
Reporter: Ted Yu
Assignee: Ted Yu
 Attachments: 3750.txt


 Currently HTablePool.putTable() doesn't call table.flushCommits()
 This may come as a surprise to users

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-3744) createTable blocks until all regions are out of transition

2011-04-06 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13016692#comment-13016692
 ] 

stack commented on HBASE-3744:
--

BulkStartupAssigner was for bulk assigning on startup.  Seems like it's been 
pulled around.  Could BulkStartupAssigner be renamed BulkOpener?  And 
BulkDisabler renamed BulkCloser?  I see need of a BulkOpen on regionserver crash 
(if it's not being used already).

 createTable blocks until all regions are out of transition
 --

 Key: HBASE-3744
 URL: https://issues.apache.org/jira/browse/HBASE-3744
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.90.1
Reporter: Todd Lipcon
Assignee: Ted Yu
 Fix For: 0.92.0

 Attachments: 3744.txt


 In HBASE-3305, the behavior of createTable was changed and introduced this 
 bug: createTable now blocks until all regions have been assigned, since it 
 uses BulkStartupAssigner. BulkStartupAssigner.waitUntilDone calls 
 assignmentManager.waitUntilNoRegionsInTransition, which waits across all 
 regions, not just the regions of the table that has just been created.
 We saw an issue where one table had a region which was unable to be opened, 
 so it was stuck in RegionsInTransition permanently (every open was failing). 
 Since this was the case, waitUntilDone would always block indefinitely even 
 though the newly created table had been assigned.
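The fix direction the report implies can be sketched as scoping the wait to the new table's own regions, rather than waiting on the global regions-in-transition set. The region-name layout and the `tableOf` helper below are illustrative stand-ins, not HBase's actual encoding or code.

```java
import java.util.Set;

// Illustrative sketch: createTable should wait only until none of the NEW
// table's regions remain in transition. Region names and tableOf() are
// stand-ins, not HBase's encoding.
public class TableScopedWait {
    /** Extract the table part of a region name like "tableName,startKey,id". */
    static String tableOf(String regionName) {
        int comma = regionName.indexOf(',');
        return comma < 0 ? regionName : regionName.substring(0, comma);
    }

    /** True when no in-transition region belongs to the given table. */
    static boolean tableSettled(String table, Set<String> regionsInTransition) {
        return regionsInTransition.stream()
            .noneMatch(r -> tableOf(r).equals(table));
    }

    public static void main(String[] args) {
        // A permanently stuck region of another table must not block the
        // newly created table's createTable() call.
        Set<String> rit = Set.of("stuckTable,aaa,1");
        System.out.println(tableSettled("newTable", rit));   // true
        System.out.println(tableSettled("stuckTable", rit)); // false
    }
}
```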

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-3750) HTablePool.putTable() should call table.flushCommits()

2011-04-06 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13016691#comment-13016691
 ] 

Ted Yu commented on HBASE-3750:
---

Thanks for the review Stack.
I attached modified version.
I think if the user turns off AutoFlush, we have a reason to flush for them. We 
can also poll the user mailing list to see their usage pattern.
Looking at javadoc:
{code}
 * Once you are done with it, return it to the pool with {@link 
#putTable(HTableInterface)}.
{code}
I don't think putting a single edit means 'done' with the table instance unless 
there was really just one edit.

 HTablePool.putTable() should call table.flushCommits()
 --

 Key: HBASE-3750
 URL: https://issues.apache.org/jira/browse/HBASE-3750
 Project: HBase
  Issue Type: Bug
  Components: client
Affects Versions: 0.90.1
Reporter: Ted Yu
Assignee: Ted Yu
 Attachments: 3750.txt


 Currently HTablePool.putTable() doesn't call table.flushCommits()
 This may come as a surprise to users

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-3743) Throttle major compaction

2011-04-06 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13016694#comment-13016694
 ] 

stack commented on HBASE-3743:
--

As a workaround, we could run a script external to hbase that would first 
elicit the set of regions in a cluster and then, per region, set in motion a 
major compaction, waiting on completion before moving to the next region (the 
script could check hdfs and count storefiles in the region to figure completion 
of the region's major compaction).  The script could be run from cron or, as in 
the legend about painting the Golden Gate Bridge, once we'd gotten to the end 
of the bridge/table, we would loop around and start in again on the first 
region, in perpetuum.

 Throttle major compaction
 -

 Key: HBASE-3743
 URL: https://issues.apache.org/jira/browse/HBASE-3743
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: Joep Rottinghuis

 Add the ability to throttle major compaction.
 For those use cases when a stop-the-world approach is not practical, it is 
 useful to be able to throttle the impact that major compaction has on the 
 cluster.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-3722) A lot of data is lost when name node crashed

2011-04-06 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13016696#comment-13016696
 ] 

stack commented on HBASE-3722:
--

That seems like a harmless addition.  Do you think it would help w/ the issue 
you saw, Gao Jinchao?  If so, I can commit.

  A lot of data is lost when name node crashed
 -

 Key: HBASE-3722
 URL: https://issues.apache.org/jira/browse/HBASE-3722
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.90.1
Reporter: gaojinchao
 Attachments: HmasterFilesystem_PatchV1.patch


 I'm not sure exactly what caused it; there are some split-failure logs.
 The master should shut itself down when HDFS has crashed.
  The logs are:
  2011-03-22 13:21:55,056 WARN 
  org.apache.hadoop.hbase.master.LogCleaner: Error while cleaning the 
  logs
  java.net.ConnectException: Call to C4C1/157.5.100.1:9000 failed on 
 connection exception: java.net.ConnectException: Connection refused
  at org.apache.hadoop.ipc.Client.wrapException(Client.java:844)
  at org.apache.hadoop.ipc.Client.call(Client.java:820)
  at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:221)
  at $Proxy5.getListing(Unknown Source)
  at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
  at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
  at java.lang.reflect.Method.invoke(Method.java:597)
  at 
 org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
  at 
 org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
  at $Proxy5.getListing(Unknown Source)
  at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:614)
  at 
 org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:252)
  at 
 org.apache.hadoop.hbase.master.LogCleaner.chore(LogCleaner.java:121)
  at org.apache.hadoop.hbase.Chore.run(Chore.java:66)
  at 
  org.apache.hadoop.hbase.master.LogCleaner.run(LogCleaner.java:154)
  Caused by: java.net.ConnectException: Connection refused
  at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
  at 
 sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
  at 
 org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
  at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:408)
  at 
 org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:332)
  at 
 org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:202)
  at org.apache.hadoop.ipc.Client.getConnection(Client.java:943)
  at org.apache.hadoop.ipc.Client.call(Client.java:788)
  ... 13 more
  2011-03-22 13:21:56,056 INFO org.apache.hadoop.ipc.Client: Retrying connect 
 to server: C4C1/157.5.100.1:9000. Already tried 0 time(s).
  2011-03-22 13:21:57,057 INFO org.apache.hadoop.ipc.Client: Retrying connect 
 to server: C4C1/157.5.100.1:9000. Already tried 1 time(s).
  2011-03-22 13:21:58,057 INFO org.apache.hadoop.ipc.Client: Retrying connect 
 to server: C4C1/157.5.100.1:9000. Already tried 2 time(s).
  2011-03-22 13:21:59,057 INFO org.apache.hadoop.ipc.Client: Retrying connect 
 to server: C4C1/157.5.100.1:9000. Already tried 3 time(s).
  2011-03-22 13:22:00,058 INFO org.apache.hadoop.ipc.Client: Retrying connect 
 to server: C4C1/157.5.100.1:9000. Already tried 4 time(s).
  2011-03-22 13:22:01,058 INFO org.apache.hadoop.ipc.Client: Retrying connect 
 to server: C4C1/157.5.100.1:9000. Already tried 5 time(s).
  2011-03-22 13:22:02,059 INFO org.apache.hadoop.ipc.Client: Retrying connect 
 to server: C4C1/157.5.100.1:9000. Already tried 6 time(s).
  2011-03-22 13:22:03,059 INFO org.apache.hadoop.ipc.Client: Retrying connect 
 to server: C4C1/157.5.100.1:9000. Already tried 7 time(s).
  2011-03-22 13:22:04,059 INFO org.apache.hadoop.ipc.Client: Retrying connect 
 to server: C4C1/157.5.100.1:9000. Already tried 8 time(s).
  2011-03-22 13:22:05,060 INFO org.apache.hadoop.ipc.Client: Retrying connect 
 to server: C4C1/157.5.100.1:9000. Already tried 9 time(s).
  2011-03-22 13:22:05,060 ERROR 
  org.apache.hadoop.hbase.master.MasterFileSystem: Failed splitting 
  hdfs://C4C1:9000/hbase/.logs/C4C9.site,60020,1300767633398
  java.net.ConnectException: Call to C4C1/157.5.100.1:9000 failed on 
 connection exception: java.net.ConnectException: Connection refused
  at org.apache.hadoop.ipc.Client.wrapException(Client.java:844)
  at org.apache.hadoop.ipc.Client.call(Client.java:820)
  at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:221)
  at $Proxy5.getFileInfo(Unknown Source)
  at 

[jira] [Commented] (HBASE-3729) Get cells via shell with a time range predicate

2011-04-06 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13016697#comment-13016697
 ] 

stack commented on HBASE-3729:
--

So, it seems like I should get the right behavior if I take your last patch Ted 
and remove this portion:

{code}
+elsif args[TIMERANGE]
+  vers = 3
+else
+  vers = 1
{code}

Do you agree?  If so, I'll test it.  I'll add a note to the help too about how 
the user might want to set VERSIONS > 1 when setting TIMERANGE.
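If the patch lands that way, combining the proposed TIMERANGE predicate with an explicit VERSIONS might look like this in the shell. This is illustrative only, using the syntax proposed in the issue below, not a tested command:

```
hbase> get 't1', 'r1', {COLUMN => 'c1',
                        TIMERANGE => (start_timestamp, end_timestamp),
                        VERSIONS => 3}
```

Without the VERSIONS setting, only the newest cell inside the range would come back, which is exactly the surprise the help note would warn about.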

 Get cells via shell with a time range predicate
 ---

 Key: HBASE-3729
 URL: https://issues.apache.org/jira/browse/HBASE-3729
 Project: HBase
  Issue Type: New Feature
  Components: shell
Reporter: Eric Charles
Assignee: Ted Yu
 Attachments: 3729-v2.txt, 3729-v3.txt, 3729-v4.txt, 3729.txt


 HBase shell allows you to specify a timestamp to get a value:
 - get 't1', 'r1', {COLUMN => 'c1', TIMESTAMP => ts1}
 If you don't give the exact timestamp, you get nothing... so it's difficult 
 to get the cell's previous versions.
 It would be fine to have a time-range-predicate-based get.
 The shell syntax could be (depending on technical feasibility):
 - get 't1', 'r1', {COLUMN => 'c1', TIMERANGE => (start_timestamp, 
 end_timestamp)}

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-3744) createTable blocks until all regions are out of transition

2011-04-06 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13016698#comment-13016698
 ] 

Ted Yu commented on HBASE-3744:
---

BulkOpen isn't used.
We also have:
{code}
  class BulkEnabler extends BulkAssigner {
{code}

 createTable blocks until all regions are out of transition
 --

 Key: HBASE-3744
 URL: https://issues.apache.org/jira/browse/HBASE-3744
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.90.1
Reporter: Todd Lipcon
Assignee: Ted Yu
 Fix For: 0.92.0

 Attachments: 3744.txt


 In HBASE-3305, the behavior of createTable was changed and introduced this 
 bug: createTable now blocks until all regions have been assigned, since it 
 uses BulkStartupAssigner. BulkStartupAssigner.waitUntilDone calls 
 assignmentManager.waitUntilNoRegionsInTransition, which waits across all 
 regions, not just the regions of the table that has just been created.
 We saw an issue where one table had a region which was unable to be opened, 
 so it was stuck in RegionsInTransition permanently (every open was failing). 
 Since this was the case, waitUntilDone would always block indefinitely even 
 though the newly created table had been assigned.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira