[jira] [Commented] (HBASE-3677) Generate a globally unique identifier for a cluster and store in /hbase/hbase.id
[ https://issues.apache.org/jira/browse/HBASE-3677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13016298#comment-13016298 ] Hudson commented on HBASE-3677: --- Integrated in HBase-TRUNK #1832 (See [https://hudson.apache.org/hudson/job/HBase-TRUNK/1832/]) HBASE-3677 Generate a globally unique cluster ID Generate a globally unique identifier for a cluster and store in /hbase/hbase.id Key: HBASE-3677 URL: https://issues.apache.org/jira/browse/HBASE-3677 Project: HBase Issue Type: Improvement Components: master Reporter: Gary Helmling Assignee: Gary Helmling Fix For: 0.92.0 Attachments: HBASE-3677_final.patch We don't currently have a way to uniquely identify an HBase cluster, apart from where it's stored in HDFS or the configuration of the ZooKeeper quorum managing it. It would be generally useful to be able to identify a cluster via API. The proposal here is pretty simple: # When the master initializes the filesystem, generate a globally unique ID and store it in /hbase/hbase.id # For existing clusters, generate hbase.id on master startup if it does not exist # Include the unique ID in the ClusterStatus returned from the master For token authentication, this will be required to allow selecting the correct token to pass to a cluster when a single client is communicating with more than one HBase instance. Chatting with J-D: replication stores its own cluster ID in place with each HLog edit, so it requires as small an identifier as possible, but I think we could automate a mapping from unique cluster ID to short ID if we had the unique ID available. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
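A minimal sketch of what steps 1 and 2 above could look like, using java.util.UUID and the Hadoop FileSystem API. The class, method names, and the plain-UTF-8 file format here are assumptions for illustration, not the contents of HBASE-3677_final.patch.
{code}
import java.io.IOException;
import java.util.UUID;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

/** Sketch: create <rootdir>/hbase.id with a random UUID if it does not already exist. */
public class ClusterIdSketch {
  public static String getOrCreateClusterId(Configuration conf, Path rootDir)
      throws IOException {
    FileSystem fs = rootDir.getFileSystem(conf);
    Path idFile = new Path(rootDir, "hbase.id");   // e.g. /hbase/hbase.id
    if (!fs.exists(idFile)) {
      // First startup (or an upgraded cluster): generate and persist a new id.
      String id = UUID.randomUUID().toString();
      FSDataOutputStream out = fs.create(idFile, false);
      try {
        out.write(id.getBytes("UTF-8"));
      } finally {
        out.close();
      }
      return id;
    }
    // Existing cluster: read the id back so it can be exposed via ClusterStatus.
    FSDataInputStream in = fs.open(idFile);
    try {
      byte[] buf = new byte[36];                   // canonical UUID string length
      in.readFully(buf);
      return new String(buf, "UTF-8");
    } finally {
      in.close();
    }
  }
}
{code}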
[jira] [Updated] (HBASE-3723) Major compact should be done when there is only one storefile and some keyvalue is outdated.
[ https://issues.apache.org/jira/browse/HBASE-3723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhoushuaifeng updated HBASE-3723: - Attachment: hbase-3723.txt When there is only one file in the store and some KeyValues are outdated, a major compaction should be triggered whether or not that file was itself produced by a major compaction. Major compact should be done when there is only one storefile and some keyvalue is outdated. Key: HBASE-3723 URL: https://issues.apache.org/jira/browse/HBASE-3723 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.90.0, 0.90.1 Reporter: zhoushuaifeng Fix For: 0.90.2 Attachments: hbase-3723.txt In the function Store.isMajorCompaction: if (filesToCompact.size() == 1) { // Single file StoreFile sf = filesToCompact.get(0); long oldest = (sf.getReader().timeRangeTracker == null) ? Long.MIN_VALUE : now - sf.getReader().timeRangeTracker.minimumTimestamp; if (sf.isMajorCompaction() && (this.ttl == HConstants.FOREVER || oldest < this.ttl)) { if (LOG.isDebugEnabled()) { LOG.debug("Skipping major compaction of " + this.storeNameStr + " because one (major) compacted file only and oldestTime " + oldest + "ms is < ttl=" + this.ttl); } } } else { When there is only one storefile in the store and some KeyValues' TTL has expired, the major compaction checker should send this region to the compaction queue and run a major compaction to clean out the outdated data. But according to the code in 0.90.1, it will do nothing. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
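A compact sketch of the decision the snippet above is arguing for, with the operators that the JIRA formatting had stripped (&& and <) restored. The method and parameter names are illustrative stand-ins, not the attached hbase-3723.txt patch.
{code}
/** Sketch: decide whether a store with exactly one file still needs a major compaction. */
public class SingleFileCompactionSketch {
  /**
   * @param fileIsMajor   true if the single file was produced by a major compaction
   * @param oldestAgeMs   age of the oldest cell in the file (now - minimumTimestamp)
   * @param ttlMs         store TTL in milliseconds
   * @param foreverTtl    sentinel meaning "never expires" (HConstants.FOREVER in HBase)
   */
  static boolean shouldMajorCompact(boolean fileIsMajor, long oldestAgeMs,
      long ttlMs, long foreverTtl) {
    boolean nothingExpired = (ttlMs == foreverTtl) || (oldestAgeMs < ttlMs);
    if (fileIsMajor && nothingExpired) {
      return false;  // already major-compacted and no cell has outlived its TTL: skip
    }
    // Either the file was not produced by a major compaction, or some values have
    // expired and should be purged by running a new major compaction.
    return true;
  }
}
{code}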
[jira] [Updated] (HBASE-3722) A lot of data is lost when name node crashed
[ https://issues.apache.org/jira/browse/HBASE-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gaojinchao updated HBASE-3722: -- Attachment: HmasterFilesystem_PatchV1.patch A lot of data is lost when name node crashed - Key: HBASE-3722 URL: https://issues.apache.org/jira/browse/HBASE-3722 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.90.1 Reporter: gaojinchao Attachments: HmasterFilesystem_PatchV1.patch I'm not sure exactly what caused it; there are some log-split failure messages. The master should shut itself down when HDFS has crashed. The logs are: 2011-03-22 13:21:55,056 WARN org.apache.hadoop.hbase.master.LogCleaner: Error while cleaning the logs java.net.ConnectException: Call to C4C1/157.5.100.1:9000 failed on connection exception: java.net.ConnectException: Connection refused at org.apache.hadoop.ipc.Client.wrapException(Client.java:844) at org.apache.hadoop.ipc.Client.call(Client.java:820) at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:221) at $Proxy5.getListing(Unknown Source) at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82) at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59) at $Proxy5.getListing(Unknown Source) at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:614) at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:252) at org.apache.hadoop.hbase.master.LogCleaner.chore(LogCleaner.java:121) at org.apache.hadoop.hbase.Chore.run(Chore.java:66) at org.apache.hadoop.hbase.master.LogCleaner.run(LogCleaner.java:154) Caused by: java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574) at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206) at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:408) at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:332) at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:202) at org.apache.hadoop.ipc.Client.getConnection(Client.java:943) at org.apache.hadoop.ipc.Client.call(Client.java:788) ... 13 more 2011-03-22 13:21:56,056 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: C4C1/157.5.100.1:9000. Already tried 0 time(s). 2011-03-22 13:21:57,057 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: C4C1/157.5.100.1:9000. Already tried 1 time(s). 2011-03-22 13:21:58,057 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: C4C1/157.5.100.1:9000. Already tried 2 time(s). 2011-03-22 13:21:59,057 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: C4C1/157.5.100.1:9000. Already tried 3 time(s). 2011-03-22 13:22:00,058 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: C4C1/157.5.100.1:9000. Already tried 4 time(s). 2011-03-22 13:22:01,058 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: C4C1/157.5.100.1:9000. Already tried 5 time(s). 2011-03-22 13:22:02,059 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: C4C1/157.5.100.1:9000. Already tried 6 time(s). 2011-03-22 13:22:03,059 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: C4C1/157.5.100.1:9000. Already tried 7 time(s). 
2011-03-22 13:22:04,059 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: C4C1/157.5.100.1:9000. Already tried 8 time(s). 2011-03-22 13:22:05,060 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: C4C1/157.5.100.1:9000. Already tried 9 time(s). 2011-03-22 13:22:05,060 ERROR org.apache.hadoop.hbase.master.MasterFileSystem: Failed splitting hdfs://C4C1:9000/hbase/.logs/C4C9.site,60020,1300767633398 java.net.ConnectException: Call to C4C1/157.5.100.1:9000 failed on connection exception: java.net.ConnectException: Connection refused at org.apache.hadoop.ipc.Client.wrapException(Client.java:844) at org.apache.hadoop.ipc.Client.call(Client.java:820) at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:221) at $Proxy5.getFileInfo(Unknown Source) at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source) at
[jira] [Commented] (HBASE-3722) A lot of data is lost when name node crashed
[ https://issues.apache.org/jira/browse/HBASE-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13016350#comment-13016350 ] gaojinchao commented on HBASE-3722: --- I have tried to fix this bug. Could someone review it? Thanks. A lot of data is lost when name node crashed - Key: HBASE-3722 URL: https://issues.apache.org/jira/browse/HBASE-3722 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.90.1 Reporter: gaojinchao Attachments: HmasterFilesystem_PatchV1.patch I'm not sure exactly what caused it; there are some log-split failure messages. The master should shut itself down when HDFS has crashed. The logs are: 2011-03-22 13:21:55,056 WARN org.apache.hadoop.hbase.master.LogCleaner: Error while cleaning the logs java.net.ConnectException: Call to C4C1/157.5.100.1:9000 failed on connection exception: java.net.ConnectException: Connection refused at org.apache.hadoop.ipc.Client.wrapException(Client.java:844) at org.apache.hadoop.ipc.Client.call(Client.java:820) at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:221) at $Proxy5.getListing(Unknown Source) at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82) at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59) at $Proxy5.getListing(Unknown Source) at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:614) at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:252) at org.apache.hadoop.hbase.master.LogCleaner.chore(LogCleaner.java:121) at org.apache.hadoop.hbase.Chore.run(Chore.java:66) at org.apache.hadoop.hbase.master.LogCleaner.run(LogCleaner.java:154) Caused by: java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574) at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206) at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:408) at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:332) at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:202) at org.apache.hadoop.ipc.Client.getConnection(Client.java:943) at org.apache.hadoop.ipc.Client.call(Client.java:788) ... 13 more 2011-03-22 13:21:56,056 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: C4C1/157.5.100.1:9000. Already tried 0 time(s). 2011-03-22 13:21:57,057 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: C4C1/157.5.100.1:9000. Already tried 1 time(s). 2011-03-22 13:21:58,057 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: C4C1/157.5.100.1:9000. Already tried 2 time(s). 2011-03-22 13:21:59,057 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: C4C1/157.5.100.1:9000. Already tried 3 time(s). 2011-03-22 13:22:00,058 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: C4C1/157.5.100.1:9000. Already tried 4 time(s). 2011-03-22 13:22:01,058 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: C4C1/157.5.100.1:9000. Already tried 5 time(s). 2011-03-22 13:22:02,059 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: C4C1/157.5.100.1:9000. Already tried 6 time(s). 2011-03-22 13:22:03,059 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: C4C1/157.5.100.1:9000. 
Already tried 7 time(s). 2011-03-22 13:22:04,059 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: C4C1/157.5.100.1:9000. Already tried 8 time(s). 2011-03-22 13:22:05,060 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: C4C1/157.5.100.1:9000. Already tried 9 time(s). 2011-03-22 13:22:05,060 ERROR org.apache.hadoop.hbase.master.MasterFileSystem: Failed splitting hdfs://C4C1:9000/hbase/.logs/C4C9.site,60020,1300767633398 java.net.ConnectException: Call to C4C1/157.5.100.1:9000 failed on connection exception: java.net.ConnectException: Connection refused at org.apache.hadoop.ipc.Client.wrapException(Client.java:844) at org.apache.hadoop.ipc.Client.call(Client.java:820) at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:221) at $Proxy5.getFileInfo(Unknown Source) at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source) at
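A minimal sketch of the behavior the report asks for (abort the master rather than loop forever when the filesystem stays unreachable). The Abortable interface and class names below are illustrative stand-ins, not the attached HmasterFilesystem_PatchV1.patch.
{code}
import java.io.IOException;

/** Sketch: shut the master down instead of retrying forever when HDFS is unreachable. */
public class FsHealthSketch {
  interface Abortable { void abort(String why, Throwable cause); }

  private final Abortable master;
  private final int maxConsecutiveFailures;
  private int consecutiveFailures = 0;

  public FsHealthSketch(Abortable master, int maxConsecutiveFailures) {
    this.master = master;
    this.maxConsecutiveFailures = maxConsecutiveFailures;
  }

  /** Call whenever a filesystem operation (log cleaning, log splitting, ...) fails. */
  public void reportFailure(IOException lastError) {
    consecutiveFailures++;
    if (consecutiveFailures >= maxConsecutiveFailures) {
      // Better to abort than to silently lose WAL edits that could not be split.
      master.abort("Filesystem unavailable, shutting down", lastError);
    }
  }

  /** Call after a successful filesystem operation. */
  public void reportSuccess() { consecutiveFailures = 0; }
}
{code}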
[jira] [Updated] (HBASE-3710) Book.xml - fill out descriptions of metrics
[ https://issues.apache.org/jira/browse/HBASE-3710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doug Meil updated HBASE-3710: - Attachment: book.xml.patch Book.xml - fill out descriptions of metrics --- Key: HBASE-3710 URL: https://issues.apache.org/jira/browse/HBASE-3710 Project: HBase Issue Type: Improvement Reporter: Doug Meil Assignee: Doug Meil Priority: Minor Attachments: book.xml.patch I filled out the skeleton of the metrics in book.xml, but I'd like one of the committers to fill in the rest of the details on what these mean. Thanks! :-) For example.. --- I'm assuming that these are referring to the LRU block cache in memory. I'd like to docs to state this (for the sake of clarity) and also the units (e.g., MB). hbase.regionserver.blockCacheCount hbase.regionserver.blockCacheFree hbase.regionserver.blockCacheHitRatio hbase.regionserver.blockCacheSize --- This is read latency from HDFS, I assume... hbase.regionserver.fsReadLatency_avg_time hbase.regionserver.fsReadLatency_num_ops hbase.regionserver.fsSyncLatency_avg_time hbase.regionserver.fsSyncLatency_num_ops hbase.regionserver.fsWriteLatency_avg_time hbase.regionserver.fsWriteLatency_num_ops point in time utilized (i.e., as opposed to max or trailing) memstore I assume, would be nice to document. hbase.regionserver.memstoreSizeMB --- obvious, but might as well document it hbase.regionserver.regions -- This is any Put, Get, Delete, or Scan operation, I assume? hbase.regionserver.requests -- detail on these would be nice, especially for tips on if there are any critical numbers/ratios to watch for. hbase.regionserver.storeFileIndexSizeMB hbase.regionserver.stores -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-3710) Book.xml - fill out descriptions of metrics
[ https://issues.apache.org/jira/browse/HBASE-3710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doug Meil updated HBASE-3710: - Status: Patch Available (was: Open) Book.xml - fill out descriptions of metrics --- Key: HBASE-3710 URL: https://issues.apache.org/jira/browse/HBASE-3710 Project: HBase Issue Type: Improvement Reporter: Doug Meil Assignee: Doug Meil Priority: Minor Attachments: book.xml.patch I filled out the skeleton of the metrics in book.xml, but I'd like one of the committers to fill in the rest of the details on what these mean. Thanks! :-) For example.. --- I'm assuming that these are referring to the LRU block cache in memory. I'd like to docs to state this (for the sake of clarity) and also the units (e.g., MB). hbase.regionserver.blockCacheCount hbase.regionserver.blockCacheFree hbase.regionserver.blockCacheHitRatio hbase.regionserver.blockCacheSize --- This is read latency from HDFS, I assume... hbase.regionserver.fsReadLatency_avg_time hbase.regionserver.fsReadLatency_num_ops hbase.regionserver.fsSyncLatency_avg_time hbase.regionserver.fsSyncLatency_num_ops hbase.regionserver.fsWriteLatency_avg_time hbase.regionserver.fsWriteLatency_num_ops point in time utilized (i.e., as opposed to max or trailing) memstore I assume, would be nice to document. hbase.regionserver.memstoreSizeMB --- obvious, but might as well document it hbase.regionserver.regions -- This is any Put, Get, Delete, or Scan operation, I assume? hbase.regionserver.requests -- detail on these would be nice, especially for tips on if there are any critical numbers/ratios to watch for. hbase.regionserver.storeFileIndexSizeMB hbase.regionserver.stores -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-3710) Book.xml - fill out descriptions of metrics
[ https://issues.apache.org/jira/browse/HBASE-3710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doug Meil updated HBASE-3710: - Attachment: (was: book.xml.patch) Book.xml - fill out descriptions of metrics --- Key: HBASE-3710 URL: https://issues.apache.org/jira/browse/HBASE-3710 Project: HBase Issue Type: Improvement Reporter: Doug Meil Assignee: Doug Meil Priority: Minor Attachments: book.xml.patch I filled out the skeleton of the metrics in book.xml, but I'd like one of the committers to fill in the rest of the details on what these mean. Thanks! :-) For example.. --- I'm assuming that these are referring to the LRU block cache in memory. I'd like to docs to state this (for the sake of clarity) and also the units (e.g., MB). hbase.regionserver.blockCacheCount hbase.regionserver.blockCacheFree hbase.regionserver.blockCacheHitRatio hbase.regionserver.blockCacheSize --- This is read latency from HDFS, I assume... hbase.regionserver.fsReadLatency_avg_time hbase.regionserver.fsReadLatency_num_ops hbase.regionserver.fsSyncLatency_avg_time hbase.regionserver.fsSyncLatency_num_ops hbase.regionserver.fsWriteLatency_avg_time hbase.regionserver.fsWriteLatency_num_ops point in time utilized (i.e., as opposed to max or trailing) memstore I assume, would be nice to document. hbase.regionserver.memstoreSizeMB --- obvious, but might as well document it hbase.regionserver.regions -- This is any Put, Get, Delete, or Scan operation, I assume? hbase.regionserver.requests -- detail on these would be nice, especially for tips on if there are any critical numbers/ratios to watch for. hbase.regionserver.storeFileIndexSizeMB hbase.regionserver.stores -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-3710) Book.xml - fill out descriptions of metrics
[ https://issues.apache.org/jira/browse/HBASE-3710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doug Meil updated HBASE-3710: - Attachment: book.xml.patch Book.xml - fill out descriptions of metrics --- Key: HBASE-3710 URL: https://issues.apache.org/jira/browse/HBASE-3710 Project: HBase Issue Type: Improvement Reporter: Doug Meil Assignee: Doug Meil Priority: Minor Attachments: book.xml.patch I filled out the skeleton of the metrics in book.xml, but I'd like one of the committers to fill in the rest of the details on what these mean. Thanks! :-) For example.. --- I'm assuming that these are referring to the LRU block cache in memory. I'd like to docs to state this (for the sake of clarity) and also the units (e.g., MB). hbase.regionserver.blockCacheCount hbase.regionserver.blockCacheFree hbase.regionserver.blockCacheHitRatio hbase.regionserver.blockCacheSize --- This is read latency from HDFS, I assume... hbase.regionserver.fsReadLatency_avg_time hbase.regionserver.fsReadLatency_num_ops hbase.regionserver.fsSyncLatency_avg_time hbase.regionserver.fsSyncLatency_num_ops hbase.regionserver.fsWriteLatency_avg_time hbase.regionserver.fsWriteLatency_num_ops point in time utilized (i.e., as opposed to max or trailing) memstore I assume, would be nice to document. hbase.regionserver.memstoreSizeMB --- obvious, but might as well document it hbase.regionserver.regions -- This is any Put, Get, Delete, or Scan operation, I assume? hbase.regionserver.requests -- detail on these would be nice, especially for tips on if there are any critical numbers/ratios to watch for. hbase.regionserver.storeFileIndexSizeMB hbase.regionserver.stores -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-1512) Coprocessors: Support aggregate functions
[ https://issues.apache.org/jira/browse/HBASE-1512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13016496#comment-13016496 ] Himanshu Vashishtha commented on HBASE-1512: Thanks for the review, Ted. What I think the divide method should do is return Double.NaN if either operand is null; any operation on null should give null. OK on the name refactoring. I don't have a strong feeling about making a separate class out of it at this point, as it doesn't add much on its own, but I will do it if you say so. Coprocessors: Support aggregate functions - Key: HBASE-1512 URL: https://issues.apache.org/jira/browse/HBASE-1512 Project: HBase Issue Type: Sub-task Components: coprocessors Reporter: stack Attachments: 1512.zip, AggregateCpProtocol.java, AggregateProtocolImpl.java, AggregationClient.java, ColumnInterpreter.java, patch-1512-2.txt, patch-1512-3.txt, patch-1512.txt Chatting with jgray and holstad at the kitchen table about counts, sums, and other aggregating facility, facility generally where you want to calculate some meta info on your table, it seems like it wouldn't be too hard making a filter type that could run a function server-side and return the result ONLY of the aggregation or whatever. For example, say you just want to count rows, currently you scan, server returns all data to client and count is done by client counting up row keys. A bunch of time and resources have been wasted returning data that we're not interested in. With this new filter type, the counting would be done server-side and then it would make up a new result that was the count only (kinda like mysql when you ask it to count, it returns a 'table' with a count column whose value is count of rows). We could have it so the count was just done per region and return that. Or we could maybe make a small change in scanner too so that it aggregated the per-region counts. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-3743) Throttle major compaction
Throttle major compaction - Key: HBASE-3743 URL: https://issues.apache.org/jira/browse/HBASE-3743 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Joep Rottinghuis Add the ability to throttle major compaction. For those use cases when a stop-the-world approach is not practical, it is useful to be able to throttle the impact that major compaction has on the cluster. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
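One plausible shape for such throttling is a simple bytes-per-second limiter invoked from the compaction write loop. This is only a sketch of the idea; the class name and call points are assumptions, not anything attached to HBASE-3743.
{code}
/** Sketch: cap the write rate of a (major) compaction at maxBytesPerSec. */
public class CompactionThrottleSketch {
  private final long maxBytesPerSec;
  private long bytesSinceCheck = 0;
  private long lastCheck = System.currentTimeMillis();

  public CompactionThrottleSketch(long maxBytesPerSec) {
    this.maxBytesPerSec = maxBytesPerSec;
  }

  /** Invoke after every batch of KeyValues written by the compaction. */
  public void throttle(long bytesWritten) throws InterruptedException {
    bytesSinceCheck += bytesWritten;
    long elapsedMs = Math.max(System.currentTimeMillis() - lastCheck, 1);
    long allowed = maxBytesPerSec * elapsedMs / 1000;
    if (bytesSinceCheck > allowed) {
      // Sleep long enough that the average rate falls back under the cap.
      long excess = bytesSinceCheck - allowed;
      Thread.sleep(excess * 1000 / maxBytesPerSec);
      bytesSinceCheck = 0;
      lastCheck = System.currentTimeMillis();
    }
  }
}
{code}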
[jira] [Commented] (HBASE-1512) Coprocessors: Support aggregate functions
[ https://issues.apache.org/jira/browse/HBASE-1512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13016519#comment-13016519 ] Ted Yu commented on HBASE-1512: --- I think returning Double.NaN is fine. Normally either of operand being null would lead to NPE. For making a separate class, it would be easier for users to produce other ColumnInterpreter classes based on LongColumnInterpreter. Coprocessors: Support aggregate functions - Key: HBASE-1512 URL: https://issues.apache.org/jira/browse/HBASE-1512 Project: HBase Issue Type: Sub-task Components: coprocessors Reporter: stack Attachments: 1512.zip, AggregateCpProtocol.java, AggregateProtocolImpl.java, AggregationClient.java, ColumnInterpreter.java, patch-1512-2.txt, patch-1512-3.txt, patch-1512.txt Chatting with jgray and holstad at the kitchen table about counts, sums, and other aggregating facility, facility generally where you want to calculate some meta info on your table, it seems like it wouldn't be too hard making a filter type that could run a function server-side and return the result ONLY of the aggregation or whatever. For example, say you just want to count rows, currently you scan, server returns all data to client and count is done by client counting up row keys. A bunch of time and resources have been wasted returning data that we're not interested in. With this new filter type, the counting would be done server-side and then it would make up a new result that was the count only (kinda like mysql when you ask it to count, it returns a 'table' with a count column whose value is count of rows). We could have it so the count was just done per region and return that. Or we could maybe make a small change in scanner too so that it aggregated the per-region counts. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
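A minimal sketch of the null handling being discussed in the two comments above: return Double.NaN instead of letting auto-unboxing throw an NPE. The class and method names are hypothetical and do not reproduce the attached AggregationClient code.
{code}
/** Sketch of a null-safe divide for computing an average from sum and count. */
public final class DivideSketch {
  private DivideSketch() {}

  public static double divideForAvg(Long sum, Long count) {
    if (sum == null || count == null) {
      // Propagate "no data" instead of throwing a NullPointerException on unboxing.
      return Double.NaN;
    }
    return (double) sum / (double) count;
  }
}
{code}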
[jira] [Created] (HBASE-3744) createTable blocks until all regions are out of transition
createTable blocks until all regions are out of transition -- Key: HBASE-3744 URL: https://issues.apache.org/jira/browse/HBASE-3744 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.90.1 Reporter: Todd Lipcon Fix For: 0.92.0 In HBASE-3305, the behavior of createTable was changed and introduced this bug: createTable now blocks until all regions have been assigned, since it uses BulkStartupAssigner. BulkStartupAssigner.waitUntilDone calls assignmentManager.waitUntilNoRegionsInTransition, which waits across all regions, not just the regions of the table that has just been created. We saw an issue where one table had a region which was unable to be opened, so it was stuck in RegionsInTransition permanently (every open was failing). Since this was the case, waitUntilDone would always block indefinitely even though the newly created table had been assigned. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
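A rough sketch of the fix direction described above: wait only on the regions belonging to the newly created table instead of on the global regions-in-transition map. All names here are illustrative; the actual AssignmentManager/BulkAssigner changes live in the attached patch.
{code}
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Sketch: block createTable only until the *new* table's regions leave transition. */
public class TableAssignmentWaitSketch {
  // Region name -> some in-transition state; stands in for the master's RIT map.
  private final ConcurrentHashMap<String, Object> regionsInTransition =
      new ConcurrentHashMap<String, Object>();

  public void waitForTableRegions(List<String> newTableRegionNames, long pollMillis)
      throws InterruptedException {
    while (true) {
      boolean pending = false;
      Set<String> inTransition = regionsInTransition.keySet();
      for (String region : newTableRegionNames) {
        if (inTransition.contains(region)) {
          pending = true;
          break;
        }
      }
      if (!pending) {
        return;  // regions of other tables stuck in transition no longer block us
      }
      Thread.sleep(pollMillis);
    }
  }
}
{code}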
[jira] [Created] (HBASE-3745) Add the ability to restrict major-compactible files by timestamp
Add the ability to restrict major-compactible files by timestamp Key: HBASE-3745 URL: https://issues.apache.org/jira/browse/HBASE-3745 Project: HBase Issue Type: Improvement Affects Versions: 0.92.0 Reporter: Todd Lipcon In some applications, a common access pattern is to frequently scan tables with a time range predicate restricted to a fairly recent time window. For example, you may want to do an incremental aggregation or indexing step only on rows that have changed in the last hour. We do this efficiently by tracking min and max timestamp on an HFile level, so that old HFiles don't have to be read. After a major compaction, however, the entire dataset will need to be read, which can hurt performance of this access pattern. We should add a column family attribute that can specify a policy like: "When major compacting, never include an HFile that contains data with a timestamp in the last 4 hours." Thus, recently flushed HFiles will always be uncompacted and provide the good scan performance required for these applications. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
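A small sketch of the selection rule proposed above: exclude from major compaction any candidate file whose newest cell timestamp falls inside the protected window. The descriptor type and method names are assumptions for illustration only.
{code}
import java.util.ArrayList;
import java.util.List;

/** Sketch: drop candidate files whose newest timestamp is inside the protected window. */
public class TimeBoundedSelectionSketch {
  /** Minimal stand-in for an HFile-level descriptor carrying its max cell timestamp. */
  public static class CandidateFile {
    final String name;
    final long maxTimestamp;
    public CandidateFile(String name, long maxTimestamp) {
      this.name = name;
      this.maxTimestamp = maxTimestamp;
    }
  }

  public static List<CandidateFile> selectForMajorCompaction(
      List<CandidateFile> candidates, long protectedWindowMs, long now) {
    long cutoff = now - protectedWindowMs;      // e.g. now minus 4 hours
    List<CandidateFile> selected = new ArrayList<CandidateFile>();
    for (CandidateFile f : candidates) {
      if (f.maxTimestamp < cutoff) {
        selected.add(f);                        // old enough: safe to rewrite
      }
      // Recently flushed files stay out of the rewrite, so time-range scans stay cheap.
    }
    return selected;
  }
}
{code}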
[jira] [Updated] (HBASE-3744) createTable blocks until all regions are out of transition
[ https://issues.apache.org/jira/browse/HBASE-3744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-3744: -- Attachment: 3744.txt First attempt for this issue. I don't see sync parameter for createTable() in HMasterInterface TestAdmin passes. createTable blocks until all regions are out of transition -- Key: HBASE-3744 URL: https://issues.apache.org/jira/browse/HBASE-3744 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.90.1 Reporter: Todd Lipcon Assignee: Ted Yu Fix For: 0.92.0 Attachments: 3744.txt In HBASE-3305, the behavior of createTable was changed and introduced this bug: createTable now blocks until all regions have been assigned, since it uses BulkStartupAssigner. BulkStartupAssigner.waitUntilDone calls assignmentManager.waitUntilNoRegionsInTransition, which waits across all regions, not just the regions of the table that has just been created. We saw an issue where one table had a region which was unable to be opened, so it was stuck in RegionsInTransition permanently (every open was failing). Since this was the case, waitUntilDone would always block indefinitely even though the newly created table had been assigned. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Work started] (HBASE-3744) createTable blocks until all regions are out of transition
[ https://issues.apache.org/jira/browse/HBASE-3744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HBASE-3744 started by Ted Yu. createTable blocks until all regions are out of transition -- Key: HBASE-3744 URL: https://issues.apache.org/jira/browse/HBASE-3744 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.90.1 Reporter: Todd Lipcon Assignee: Ted Yu Fix For: 0.92.0 Attachments: 3744.txt In HBASE-3305, the behavior of createTable was changed and introduced this bug: createTable now blocks until all regions have been assigned, since it uses BulkStartupAssigner. BulkStartupAssigner.waitUntilDone calls assignmentManager.waitUntilNoRegionsInTransition, which waits across all regions, not just the regions of the table that has just been created. We saw an issue where one table had a region which was unable to be opened, so it was stuck in RegionsInTransition permanently (every open was failing). Since this was the case, waitUntilDone would always block indefinitely even though the newly created table had been assigned. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3587) Eliminate use of ReadWriteLock in RegionObserver coprocessor invocation
[ https://issues.apache.org/jira/browse/HBASE-3587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13016566#comment-13016566 ] Gary Helmling commented on HBASE-3587: -- Posted patch for review at: https://review.cloudera.org/r/1681/ Eliminate use of ReadWriteLock in RegionObserver coprocessor invocation --- Key: HBASE-3587 URL: https://issues.apache.org/jira/browse/HBASE-3587 Project: HBase Issue Type: Improvement Components: coprocessors Reporter: Gary Helmling Assignee: Gary Helmling Follow-up to a discussion on the dev list: http://search-hadoop.com/m/jOovV1uAJBP The CoprocessorHost ReentrantReadWriteLock is imposing some overhead on data read/write operations, even when no coprocessors are loaded. Currently execution of RegionCoprocessorHost pre/postXXX() methods are guarded by acquiring the coprocessor read lock. This is used to prevent coprocessor registration from modifying the coprocessor collection while upcall hooks are in progress. On further discussion, and looking at the locking in HRegion, it should be sufficient to just use a CopyOnWriteArrayList for the coprocessor collection. We can then remove the coprocessor lock and eliminate the associated overhead without having to special case the no loaded coprocessors condition. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
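A minimal sketch of the approach the issue describes: hold the loaded coprocessors in a CopyOnWriteArrayList so registration copies the backing array and upcall iteration needs no lock. The Observer interface and class names are simplified stand-ins, not the posted patch.
{code}
import java.util.concurrent.CopyOnWriteArrayList;

/** Sketch: iterate a CopyOnWriteArrayList of observers instead of taking a read lock. */
public class ObserverHostSketch {
  /** Stand-in for RegionObserver; only the one hook needed for the example. */
  public interface Observer {
    void prePut(byte[] row);
  }

  // Registration copies the backing array; in-flight iterations keep their old snapshot,
  // so no ReentrantReadWriteLock is needed around the upcalls.
  private final CopyOnWriteArrayList<Observer> observers =
      new CopyOnWriteArrayList<Observer>();

  public void register(Observer o) {
    observers.addIfAbsent(o);
  }

  public void prePut(byte[] row) {
    for (Observer o : observers) {   // lock-free iteration over an immutable snapshot
      o.prePut(row);
    }
  }
}
{code}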
[jira] [Created] (HBASE-3746) Clean up CompressionTest to not directly reference DistributedFileSystem
Clean up CompressionTest to not directly reference DistributedFileSystem Key: HBASE-3746 URL: https://issues.apache.org/jira/browse/HBASE-3746 Project: HBase Issue Type: Improvement Affects Versions: 0.90.1 Reporter: Todd Lipcon Assignee: Todd Lipcon Fix For: 0.90.3 Right now, CompressionTest has a number of issues: - it always writes to the home directory of the user, regardless of the path provided - it requires actually writing to HDFS when a local file is probably sufficient -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
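A sketch of the cleanup direction: resolve the FileSystem from the path the user actually supplied, so a local file:// path works without touching HDFS and without defaulting to the user's home directory. This is illustrative only and is not the attached hbase-3746.txt.
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

/** Sketch: resolve the filesystem from the user-supplied path instead of assuming HDFS. */
public class CompressionTestPathSketch {
  public static void main(String[] args) throws Exception {
    if (args.length < 1) {
      System.err.println("Usage: CompressionTestPathSketch <path>");
      System.exit(1);                              // exit after printing usage
    }
    Configuration conf = new Configuration();
    Path path = new Path(args[0]);                 // file:///tmp/test works as well as hdfs://...
    FileSystem fs = path.getFileSystem(conf);      // no DistributedFileSystem reference
    System.out.println("Would write test file to " + fs.makeQualified(path));
  }
}
{code}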
[jira] [Commented] (HBASE-3744) createTable blocks until all regions are out of transition
[ https://issues.apache.org/jira/browse/HBASE-3744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13016577#comment-13016577 ] Ted Yu commented on HBASE-3744: --- From IRC: tlipcon_: it seems wrong that we use something called BukStartupAssigner in the first place [2:54pm] tlipcon_: since this is not startup ... tlipcon_: i think BulkStartupAssigner should be renamed [3:11pm] tlipcon_: and then either a parameter or a different subclass that doesn't wait createTable blocks until all regions are out of transition -- Key: HBASE-3744 URL: https://issues.apache.org/jira/browse/HBASE-3744 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.90.1 Reporter: Todd Lipcon Assignee: Ted Yu Fix For: 0.92.0 Attachments: 3744.txt In HBASE-3305, the behavior of createTable was changed and introduced this bug: createTable now blocks until all regions have been assigned, since it uses BulkStartupAssigner. BulkStartupAssigner.waitUntilDone calls assignmentManager.waitUntilNoRegionsInTransition, which waits across all regions, not just the regions of the table that has just been created. We saw an issue where one table had a region which was unable to be opened, so it was stuck in RegionsInTransition permanently (every open was failing). Since this was the case, waitUntilDone would always block indefinitely even though the newly created table had been assigned. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3744) createTable blocks until all regions are out of transition
[ https://issues.apache.org/jira/browse/HBASE-3744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13016583#comment-13016583 ] Ted Yu commented on HBASE-3744: --- Currently we have: {code} static class BulkStartupAssigner extends BulkAssigner { {code} We also have: {code} class BulkDisabler extends BulkAssigner { {code} If we rename BulkAssigner as AbstractBulkAssigner, we can rename BulkStartupAssigner as BulkAssigner createTable blocks until all regions are out of transition -- Key: HBASE-3744 URL: https://issues.apache.org/jira/browse/HBASE-3744 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.90.1 Reporter: Todd Lipcon Assignee: Ted Yu Fix For: 0.92.0 Attachments: 3744.txt In HBASE-3305, the behavior of createTable was changed and introduced this bug: createTable now blocks until all regions have been assigned, since it uses BulkStartupAssigner. BulkStartupAssigner.waitUntilDone calls assignmentManager.waitUntilNoRegionsInTransition, which waits across all regions, not just the regions of the table that has just been created. We saw an issue where one table had a region which was unable to be opened, so it was stuck in RegionsInTransition permanently (every open was failing). Since this was the case, waitUntilDone would always block indefinitely even though the newly created table had been assigned. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-1512) Coprocessors: Support aggregate functions
[ https://issues.apache.org/jira/browse/HBASE-1512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13016585#comment-13016585 ] Ted Yu commented on HBASE-1512: --- Version 4 is awesome. Coprocessors: Support aggregate functions - Key: HBASE-1512 URL: https://issues.apache.org/jira/browse/HBASE-1512 Project: HBase Issue Type: Sub-task Components: coprocessors Reporter: stack Attachments: 1512.zip, AggregateCpProtocol.java, AggregateProtocolImpl.java, AggregationClient.java, ColumnInterpreter.java, patch-1512-2.txt, patch-1512-3.txt, patch-1512-4.txt, patch-1512.txt Chatting with jgray and holstad at the kitchen table about counts, sums, and other aggregating facility, facility generally where you want to calculate some meta info on your table, it seems like it wouldn't be too hard making a filter type that could run a function server-side and return the result ONLY of the aggregation or whatever. For example, say you just want to count rows, currently you scan, server returns all data to client and count is done by client counting up row keys. A bunch of time and resources have been wasted returning data that we're not interested in. With this new filter type, the counting would be done server-side and then it would make up a new result that was the count only (kinda like mysql when you ask it to count, it returns a 'table' with a count column whose value is count of rows). We could have it so the count was just done per region and return that. Or we could maybe make a small change in scanner too so that it aggregated the per-region counts. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-3747) [replication] ReplicationSource should differanciate remote and local exceptions
[replication] ReplicationSource should differanciate remote and local exceptions Key: HBASE-3747 URL: https://issues.apache.org/jira/browse/HBASE-3747 Project: HBase Issue Type: Improvement Affects Versions: 0.90.1 Reporter: Jean-Daniel Cryans Assignee: Jean-Daniel Cryans Priority: Minor Fix For: 0.90.3 From Jeff Whiting on the list: I'm not sure...the key to everything was realizing picking up on RemoteException in Unable to replicate because org.apache.hadoop.ipc.RemoteException and realizing that it was on the replication cluster. If it said something like Unable to replicate. Destination cluster threw an exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: java.lang.RuntimeException: Or something that makes it clear it is an exception on the remote or destination cluster would be helpful. It is easy to scan over org.apache.hadoop.ipc.RemoteException and read it like org.ap...some-kind-of...exception. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
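A minimal sketch of the distinction being asked for: a RemoteException means the peer cluster received the RPC and threw, while other IOExceptions are local or network problems. Only the Hadoop RemoteException class is assumed here; the wording and method are illustrative, not the committed patch.
{code}
import java.io.IOException;

import org.apache.hadoop.ipc.RemoteException;

/** Sketch: label failures from the destination (remote) cluster differently from local ones. */
public class ReplicationErrorLogSketch {
  public static String describe(IOException e) {
    if (e instanceof RemoteException) {
      // The RPC reached the peer; the peer threw. Blame the destination cluster explicitly.
      return "Unable to replicate. Destination cluster threw: "
          + ((RemoteException) e).getClassName() + ": " + e.getMessage();
    }
    // Connection-level problems (connection refused, timeouts, ...) are local/network issues.
    return "Unable to replicate because of a local or network error: " + e;
  }
}
{code}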
[jira] [Created] (HBASE-3748) Add rolling of thrift/rest daemons to graceful_stop.sh script
Add rolling of thrift/rest daemons to graceful_stop.sh script - Key: HBASE-3748 URL: https://issues.apache.org/jira/browse/HBASE-3748 Project: HBase Issue Type: Task Affects Versions: 0.92.0 Reporter: stack Assignee: stack Add option to stop/start thrift and rest servers as part of rolling a server. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HBASE-3748) Add rolling of thrift/rest daemons to graceful_stop.sh script
[ https://issues.apache.org/jira/browse/HBASE-3748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack resolved HBASE-3748. -- Resolution: Fixed Fix Version/s: 0.92.0 Applied to trunk and branch... and added some more to the decommission section in the book. Add rolling of thrift/rest daemons to graceful_stop.sh script - Key: HBASE-3748 URL: https://issues.apache.org/jira/browse/HBASE-3748 Project: HBase Issue Type: Task Affects Versions: 0.92.0 Reporter: stack Assignee: stack Fix For: 0.92.0 Add option to stop/start thrift and rest servers as part of rolling a server. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3746) Clean up CompressionTest to not directly reference DistributedFileSystem
[ https://issues.apache.org/jira/browse/HBASE-3746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13016605#comment-13016605 ] stack commented on HBASE-3746: -- +1 Nice. Clean up CompressionTest to not directly reference DistributedFileSystem Key: HBASE-3746 URL: https://issues.apache.org/jira/browse/HBASE-3746 Project: HBase Issue Type: Improvement Affects Versions: 0.90.1 Reporter: Todd Lipcon Assignee: Todd Lipcon Fix For: 0.90.3 Attachments: hbase-3746.txt Right now, CompressionTest has a number of issues: - it always writes to the home directory of the user, regardless of the path provided - it requires actually writing to HDFS when a local file is probably sufficient -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3587) Eliminate use of ReadWriteLock in RegionObserver coprocessor invocation
[ https://issues.apache.org/jira/browse/HBASE-3587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13016607#comment-13016607 ] stack commented on HBASE-3587: -- +1 Nice one Gary. Eliminate use of ReadWriteLock in RegionObserver coprocessor invocation --- Key: HBASE-3587 URL: https://issues.apache.org/jira/browse/HBASE-3587 Project: HBase Issue Type: Improvement Components: coprocessors Reporter: Gary Helmling Assignee: Gary Helmling Follow-up to a discussion on the dev list: http://search-hadoop.com/m/jOovV1uAJBP The CoprocessorHost ReentrantReadWriteLock is imposing some overhead on data read/write operations, even when no coprocessors are loaded. Currently execution of RegionCoprocessorHost pre/postXXX() methods are guarded by acquiring the coprocessor read lock. This is used to prevent coprocessor registration from modifying the coprocessor collection while upcall hooks are in progress. On further discussion, and looking at the locking in HRegion, it should be sufficient to just use a CopyOnWriteArrayList for the coprocessor collection. We can then remove the coprocessor lock and eliminate the associated overhead without having to special case the no loaded coprocessors condition. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-3747) [replication] ReplicationSource should differanciate remote and local exceptions
[ https://issues.apache.org/jira/browse/HBASE-3747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jean-Daniel Cryans updated HBASE-3747: -- Attachment: HBASE-3747.patch Patch that adds better logging of the issue and adds some refactoring for RemoteException handling. [replication] ReplicationSource should differanciate remote and local exceptions Key: HBASE-3747 URL: https://issues.apache.org/jira/browse/HBASE-3747 Project: HBase Issue Type: Improvement Affects Versions: 0.90.1 Reporter: Jean-Daniel Cryans Assignee: Jean-Daniel Cryans Priority: Minor Fix For: 0.90.3 Attachments: HBASE-3747.patch From Jeff Whiting on the list: I'm not sure...the key to everything was realizing picking up on RemoteException in Unable to replicate because org.apache.hadoop.ipc.RemoteException and realizing that it was on the replication cluster. If it said something like Unable to replicate. Destination cluster threw an exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: java.lang.RuntimeException: Or something that makes it clear it is an exception on the remote or destination cluster would be helpful. It is easy to scan over org.apache.hadoop.ipc.RemoteException and read it like org.ap...some-kind-of...exception. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-3734) HBaseAdmin creates new configurations in getCatalogTracker
[ https://issues.apache.org/jira/browse/HBASE-3734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jean-Daniel Cryans updated HBASE-3734: -- Attachment: HBASE-3734.patch Patch that adds the copy in HBA's constructor and removes the copy in getCatalogTracker, passes unit tests. HBaseAdmin creates new configurations in getCatalogTracker -- Key: HBASE-3734 URL: https://issues.apache.org/jira/browse/HBASE-3734 Project: HBase Issue Type: Bug Affects Versions: 0.90.1 Reporter: Jean-Daniel Cryans Fix For: 0.90.3 Attachments: HBASE-3734.patch HBaseAdmin.getCatalogTracker creates new Configuration every time it's called, instead HBA should reuse the same one and do the copy inside the constructor. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
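A sketch of the fix described in the patch summary: take one defensive copy of the Configuration in the constructor and reuse it, instead of allocating a fresh Configuration on every getCatalogTracker() call. Class and method names below are illustrative, not the attached HBASE-3734.patch.
{code}
import org.apache.hadoop.conf.Configuration;

/** Sketch: copy the Configuration once, then reuse it for every catalog-tracker request. */
public class AdminConfSketch {
  private final Configuration conf;

  public AdminConfSketch(Configuration userConf) {
    // Single defensive copy so later mutations by the caller don't leak in,
    // and so we stop creating a new Configuration per getCatalogTracker() call.
    this.conf = new Configuration(userConf);
  }

  public Configuration getConfForCatalogTracker() {
    return this.conf;   // reused, not re-created
  }
}
{code}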
[jira] [Updated] (HBASE-3710) Book.xml - fill out descriptions of metrics
[ https://issues.apache.org/jira/browse/HBASE-3710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-3710: - Resolution: Fixed Fix Version/s: 0.92.0 Hadoop Flags: [Reviewed] Status: Resolved (was: Patch Available) Committed to TRUNK. Thanks for patch Doug. The TODOs are fine. Book.xml - fill out descriptions of metrics --- Key: HBASE-3710 URL: https://issues.apache.org/jira/browse/HBASE-3710 Project: HBase Issue Type: Improvement Reporter: Doug Meil Assignee: Doug Meil Priority: Minor Fix For: 0.92.0 Attachments: book.xml.patch I filled out the skeleton of the metrics in book.xml, but I'd like one of the committers to fill in the rest of the details on what these mean. Thanks! :-) For example.. --- I'm assuming that these are referring to the LRU block cache in memory. I'd like to docs to state this (for the sake of clarity) and also the units (e.g., MB). hbase.regionserver.blockCacheCount hbase.regionserver.blockCacheFree hbase.regionserver.blockCacheHitRatio hbase.regionserver.blockCacheSize --- This is read latency from HDFS, I assume... hbase.regionserver.fsReadLatency_avg_time hbase.regionserver.fsReadLatency_num_ops hbase.regionserver.fsSyncLatency_avg_time hbase.regionserver.fsSyncLatency_num_ops hbase.regionserver.fsWriteLatency_avg_time hbase.regionserver.fsWriteLatency_num_ops point in time utilized (i.e., as opposed to max or trailing) memstore I assume, would be nice to document. hbase.regionserver.memstoreSizeMB --- obvious, but might as well document it hbase.regionserver.regions -- This is any Put, Get, Delete, or Scan operation, I assume? hbase.regionserver.requests -- detail on these would be nice, especially for tips on if there are any critical numbers/ratios to watch for. hbase.regionserver.storeFileIndexSizeMB hbase.regionserver.stores -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-3738) Book.xml - expanding Architecture Client section
[ https://issues.apache.org/jira/browse/HBASE-3738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-3738: - Resolution: Fixed Fix Version/s: 0.92.0 Hadoop Flags: [Reviewed] Status: Resolved (was: Patch Available) Committed to TRUNK. Thanks for the patch Doug. In future, try exorcising tabs, do ~80 characters a line, and name your patch for the issue number instead of calling it book.xml.patch each time. Thanks Doug. Our book is coming along nicely. Book.xml - expanding Architecture Client section Key: HBASE-3738 URL: https://issues.apache.org/jira/browse/HBASE-3738 Project: HBase Issue Type: Improvement Reporter: Doug Meil Assignee: Doug Meil Priority: Minor Fix For: 0.92.0 Attachments: book.xml.patch Expanded the Architecture Client section. Broke 'connection' into sub-section, and created 'writebuffer and batch methods' into another sub-section. Both seem to be fairly frequent questions on the dist-list. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-3746) Clean up CompressionTest to not directly reference DistributedFileSystem
[ https://issues.apache.org/jira/browse/HBASE-3746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon updated HBASE-3746: --- Attachment: hbase-3746.txt Slightly updated patch to make it exit after calling usage(), and proper example as file:///tmp instead of file://tmp Committing to trunk and branch. Clean up CompressionTest to not directly reference DistributedFileSystem Key: HBASE-3746 URL: https://issues.apache.org/jira/browse/HBASE-3746 Project: HBase Issue Type: Improvement Affects Versions: 0.90.1 Reporter: Todd Lipcon Assignee: Todd Lipcon Fix For: 0.90.3 Attachments: hbase-3746.txt, hbase-3746.txt Right now, CompressionTest has a number of issues: - it always writes to the home directory of the user, regardless of the path provided - it requires actually writing to HDFS when a local file is probably sufficient -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3738) Book.xml - expanding Architecture Client section
[ https://issues.apache.org/jira/browse/HBASE-3738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13016620#comment-13016620 ] Doug Meil commented on HBASE-3738: -- Roger that! Sorry about the tabs. For some reason I thought I was helping by using 'book.xml.patch' every time. It makes sense to name it for the edit. Book.xml - expanding Architecture Client section Key: HBASE-3738 URL: https://issues.apache.org/jira/browse/HBASE-3738 Project: HBase Issue Type: Improvement Reporter: Doug Meil Assignee: Doug Meil Priority: Minor Fix For: 0.92.0 Attachments: book.xml.patch Expanded the Architecture Client section. Broke 'connection' into sub-section, and created 'writebuffer and batch methods' into another sub-section. Both seem to be fairly frequent questions on the dist-list. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-3746) Clean up CompressionTest to not directly reference DistributedFileSystem
[ https://issues.apache.org/jira/browse/HBASE-3746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon updated HBASE-3746: --- Resolution: Fixed Hadoop Flags: [Reviewed] Status: Resolved (was: Patch Available) Clean up CompressionTest to not directly reference DistributedFileSystem Key: HBASE-3746 URL: https://issues.apache.org/jira/browse/HBASE-3746 Project: HBase Issue Type: Improvement Affects Versions: 0.90.1 Reporter: Todd Lipcon Assignee: Todd Lipcon Fix For: 0.90.3 Attachments: hbase-3746.txt, hbase-3746.txt Right now, CompressionTest has a number of issues: - it always writes to the home directory of the user, regardless of the path provided - it requires actually writing to HDFS when a local file is probably sufficient -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-3587) Eliminate use of ReadWriteLock in RegionObserver coprocessor invocation
[ https://issues.apache.org/jira/browse/HBASE-3587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gary Helmling updated HBASE-3587: - Attachment: HBASE-3587.patch Patch committed to trunk Eliminate use of ReadWriteLock in RegionObserver coprocessor invocation --- Key: HBASE-3587 URL: https://issues.apache.org/jira/browse/HBASE-3587 Project: HBase Issue Type: Improvement Components: coprocessors Reporter: Gary Helmling Assignee: Gary Helmling Fix For: 0.92.0 Attachments: HBASE-3587.patch Follow-up to a discussion on the dev list: http://search-hadoop.com/m/jOovV1uAJBP The CoprocessorHost ReentrantReadWriteLock is imposing some overhead on data read/write operations, even when no coprocessors are loaded. Currently execution of RegionCoprocessorHost pre/postXXX() methods are guarded by acquiring the coprocessor read lock. This is used to prevent coprocessor registration from modifying the coprocessor collection while upcall hooks are in progress. On further discussion, and looking at the locking in HRegion, it should be sufficient to just use a CopyOnWriteArrayList for the coprocessor collection. We can then remove the coprocessor lock and eliminate the associated overhead without having to special case the no loaded coprocessors condition. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HBASE-3587) Eliminate use of ReadWriteLock in RegionObserver coprocessor invocation
[ https://issues.apache.org/jira/browse/HBASE-3587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gary Helmling resolved HBASE-3587. -- Resolution: Fixed Fix Version/s: 0.92.0 Committed to trunk Eliminate use of ReadWriteLock in RegionObserver coprocessor invocation --- Key: HBASE-3587 URL: https://issues.apache.org/jira/browse/HBASE-3587 Project: HBase Issue Type: Improvement Components: coprocessors Reporter: Gary Helmling Assignee: Gary Helmling Fix For: 0.92.0 Attachments: HBASE-3587.patch Follow-up to a discussion on the dev list: http://search-hadoop.com/m/jOovV1uAJBP The CoprocessorHost ReentrantReadWriteLock is imposing some overhead on data read/write operations, even when no coprocessors are loaded. Currently execution of RegionCoprocessorHost pre/postXXX() methods are guarded by acquiring the coprocessor read lock. This is used to prevent coprocessor registration from modifying the coprocessor collection while upcall hooks are in progress. On further discussion, and looking at the locking in HRegion, it should be sufficient to just use a CopyOnWriteArrayList for the coprocessor collection. We can then remove the coprocessor lock and eliminate the associated overhead without having to special case the no loaded coprocessors condition. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HBASE-3747) [replication] ReplicationSource should differanciate remote and local exceptions
[ https://issues.apache.org/jira/browse/HBASE-3747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jean-Daniel Cryans resolved HBASE-3747. --- Resolution: Fixed Committed to branch and trunk. [replication] ReplicationSource should differanciate remote and local exceptions Key: HBASE-3747 URL: https://issues.apache.org/jira/browse/HBASE-3747 Project: HBase Issue Type: Improvement Affects Versions: 0.90.1 Reporter: Jean-Daniel Cryans Assignee: Jean-Daniel Cryans Priority: Minor Fix For: 0.90.3 Attachments: HBASE-3747.patch From Jeff Whiting on the list: I'm not sure...the key to everything was realizing picking up on RemoteException in Unable to replicate because org.apache.hadoop.ipc.RemoteException and realizing that it was on the replication cluster. If it said something like Unable to replicate. Destination cluster threw an exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: java.lang.RuntimeException: Or something that makes it clear it is an exception on the remote or destination cluster would be helpful. It is easy to scan over org.apache.hadoop.ipc.RemoteException and read it like org.ap...some-kind-of...exception. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (HBASE-3734) HBaseAdmin creates new configurations in getCatalogTracker
[ https://issues.apache.org/jira/browse/HBASE-3734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jean-Daniel Cryans reassigned HBASE-3734: - Assignee: Jean-Daniel Cryans HBaseAdmin creates new configurations in getCatalogTracker -- Key: HBASE-3734 URL: https://issues.apache.org/jira/browse/HBASE-3734 Project: HBase Issue Type: Bug Affects Versions: 0.90.1 Reporter: Jean-Daniel Cryans Assignee: Jean-Daniel Cryans Fix For: 0.90.3 Attachments: HBASE-3734.patch HBaseAdmin.getCatalogTracker creates new Configuration every time it's called, instead HBA should reuse the same one and do the copy inside the constructor. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-3749) Master can't exit when open port failed
Master can't exit when open port failed --- Key: HBASE-3749 URL: https://issues.apache.org/jira/browse/HBASE-3749 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.90.1 Reporter: gaojinchao When the HMaster crashed and was restarted, the HMaster hangs during initialization. // start up all service threads. startServiceThreads(); --- opening the port failed here! // Wait for region servers to report in. Returns count of regions. int regionCount = this.serverManager.waitForRegionServers(); // TODO: Should do this in background rather than block master startup this.fileSystemManager.splitLogAfterStartup(this.serverManager.getOnlineServers()); // Make sure root and meta assigned before proceeding. assignRootAndMeta(); --- this function hangs, because -ROOT- can't be assigned. if (!catalogTracker.verifyRootRegionLocation(timeout)) { this.assignmentManager.assignRoot(); this.catalogTracker.waitForRoot(); --- this statement hangs. assigned++; } The log is: 2011-04-07 16:38:22,850 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog 2011-04-07 16:38:22,908 INFO org.apache.hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 60010 2011-04-07 16:38:22,909 FATAL org.apache.hadoop.hbase.master.HMaster: Failed startup java.net.BindException: Address already in use at sun.nio.ch.Net.bind(Native Method) at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:119) at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59) at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216) at org.apache.hadoop.http.HttpServer.start(HttpServer.java:445) at org.apache.hadoop.hbase.master.HMaster.startServiceThreads(HMaster.java:542) at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:373) at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:278) 2011-04-07 16:38:22,910 INFO org.apache.hadoop.hbase.master.HMaster: Aborting 2011-04-07 16:38:22,911 INFO org.apache.hadoop.hbase.master.ServerManager: Exiting wait on regionserver(s) to checkin; count=0, stopped=true, count of regions out on cluster=0 2011-04-07 16:38:22,914 DEBUG org.apache.hadoop.hbase.master.MasterFileSystem: No log files to split, proceeding... 2011-04-07 16:38:22,930 INFO org.apache.hadoop.ipc.HbaseRPC: Server at 167-6-1-12/167.6.1.12:60020 could not be reached after 1 tries, giving up. 2011-04-07 16:38:22,930 INFO org.apache.hadoop.hbase.catalog.RootLocationEditor: Unsetting ROOT region location in ZooKeeper 2011-04-07 16:38:22,941 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:6-0x22f2c49d2590021 Creating (or updating) unassigned node for 70236052 with OFFLINE state 2011-04-07 16:38:22,956 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Server stopped; skipping assign of -ROOT-,,0.70236052 state=OFFLINE, ts=1302165502941 2011-04-07 16:38:32,746 INFO org.apache.hadoop.hbase.master.AssignmentManager$TimeoutMonitor: 167-6-1-11:6.timeoutMonitor exiting 2011-04-07 16:39:22,770 INFO org.apache.hadoop.hbase.master.LogCleaner: master-167-6-1-11:6.oldLogCleaner exiting -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
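One way to read the hang above is that blocking startup waits (such as waiting for -ROOT- assignment) never observe that abort() has already been called. A minimal sketch of a wait loop that checks a stopped flag; the names here are assumptions and not HBase's actual HMaster code.
{code}
/** Sketch: any blocking wait during master startup should give up once abort() was called. */
public class StartupWaitSketch {
  public interface ConditionCheck { boolean isDone(); }

  private volatile boolean stopped = false;

  public void abort(String why, Throwable cause) {
    stopped = true;          // set by the failed startServiceThreads() path
  }

  /** Wait for some condition (e.g. -ROOT- assignment) but bail out if we are stopping. */
  public boolean waitForCondition(ConditionCheck check, long pollMillis)
      throws InterruptedException {
    while (!check.isDone()) {
      if (stopped) {
        return false;        // let the startup thread unwind so the process can exit
      }
      Thread.sleep(pollMillis);
    }
    return true;
  }
}
{code}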
[jira] [Created] (HBASE-3750) HTablePool.putTable() should call table.flushCommits()
HTablePool.putTable() should call table.flushCommits() -- Key: HBASE-3750 URL: https://issues.apache.org/jira/browse/HBASE-3750 Project: HBase Issue Type: Bug Components: client Affects Versions: 0.90.1 Reporter: Ted Yu Assignee: Ted Yu Currently HTablePool.putTable() doesn't call table.flushCommits(). This may turn out to be a surprise for users. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-3750) HTablePool.putTable() should call table.flushCommits()
[ https://issues.apache.org/jira/browse/HBASE-3750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-3750: -- Attachment: 3750.txt HTablePool.putTable() should call table.flushCommits() -- Key: HBASE-3750 URL: https://issues.apache.org/jira/browse/HBASE-3750 Project: HBase Issue Type: Bug Components: client Affects Versions: 0.90.1 Reporter: Ted Yu Assignee: Ted Yu Attachments: 3750.txt Currently HTablePool.putTable() doesn't call table.flushCommits(). This may turn out to be a surprise for users. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HBASE-3723) Major compact should be done when there is only one storefile and some keyvalue is outdated.
[ https://issues.apache.org/jira/browse/HBASE-3723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack resolved HBASE-3723. -- Resolution: Fixed Hadoop Flags: [Reviewed] Committed to TRUNK. Thank you for the patch, Zhou. Major compact should be done when there is only one storefile and some keyvalue is outdated. Key: HBASE-3723 URL: https://issues.apache.org/jira/browse/HBASE-3723 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.90.0, 0.90.1 Reporter: zhoushuaifeng Fix For: 0.90.2 Attachments: hbase-3723.txt In the function Store.isMajorCompaction:
if (filesToCompact.size() == 1) {
  // Single file
  StoreFile sf = filesToCompact.get(0);
  long oldest = (sf.getReader().timeRangeTracker == null) ?
      Long.MIN_VALUE :
      now - sf.getReader().timeRangeTracker.minimumTimestamp;
  if (sf.isMajorCompaction() &&
      (this.ttl == HConstants.FOREVER || oldest < this.ttl)) {
    if (LOG.isDebugEnabled()) {
      LOG.debug("Skipping major compaction of " + this.storeNameStr +
          " because one (major) compacted file only and oldestTime " +
          oldest + "ms is < ttl=" + this.ttl);
    }
  }
} else {
When there is only one storefile in the store and some KeyValues have outlived their TTL, the major compaction checker should send this region to the compaction queue and run a major compaction to clean out the expired data. But according to the code in 0.90.1, it does nothing. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
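The intended single-storefile decision can be summarized in a small, self-contained check. This is a sketch of the logic described above with hypothetical parameter names, not the attached hbase-3723.txt patch:
{code}
public final class SingleFileMajorCompactCheck {
  private SingleFileMajorCompactCheck() {}

  // Trigger a major compaction whenever the lone file can hold expired KeyValues,
  // regardless of whether that file was itself produced by a major compaction.
  public static boolean shouldMajorCompact(long oldestEditAgeMs, long ttlMs, boolean ttlIsForever) {
    return !ttlIsForever && oldestEditAgeMs > ttlMs;
  }
}
{code}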
[jira] [Commented] (HBASE-1364) [performance] Distributed splitting of regionserver commit logs
[ https://issues.apache.org/jira/browse/HBASE-1364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13016681#comment-13016681 ] Ted Yu commented on HBASE-1364: --- I got the following from TestDistributedLogSplitting: {code} testOrphanLogCreation(org.apache.hadoop.hbase.master.TestDistributedLogSplitting) Time elapsed: 33.203 sec ERROR! java.lang.Exception: Unexpected exception, expectedorg.apache.hadoop.hbase.regionserver.wal.OrphanHLogAfterSplitException but wasjava.lang.Error at org.junit.internal.runners.statements.ExpectException.evaluate(ExpectException.java:28) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:76) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:193) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:52) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:191) at org.junit.runners.ParentRunner.access$000(ParentRunner.java:42) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:184) at org.junit.runners.ParentRunner.run(ParentRunner.java:236) at org.apache.maven.surefire.junit4.JUnit4TestSet.execute(JUnit4TestSet.java:62) at org.apache.maven.surefire.suite.AbstractDirectoryTestSuite.executeTestSet(AbstractDirectoryTestSuite.java:140) at org.apache.maven.surefire.suite.AbstractDirectoryTestSuite.execute(AbstractDirectoryTestSuite.java:165) at org.apache.maven.surefire.Surefire.run(Surefire.java:107) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.maven.surefire.booter.SurefireBooter.runSuitesInProcess(SurefireBooter.java:289) at org.apache.maven.surefire.booter.SurefireBooter.main(SurefireBooter.java:1005) Caused by: java.lang.Error: Unresolved compilation problem: The method appendNoSync(HRegionInfo, byte[], WALEdit, long) is undefined for the type HLog {code} [performance] Distributed splitting of regionserver commit logs --- Key: HBASE-1364 URL: https://issues.apache.org/jira/browse/HBASE-1364 Project: HBase Issue Type: Improvement Components: coprocessors Reporter: stack Assignee: Prakash Khemani Priority: Critical Fix For: 0.92.0 Attachments: HBASE-1364.patch Time Spent: 8h Remaining Estimate: 0h HBASE-1008 has some improvements to our log splitting on regionserver crash; but it needs to run even faster. (Below is from HBASE-1008) In bigtable paper, the split is distributed. If we're going to have 1000 logs, we need to distribute or at least multithread the splitting. 1. As is, regions starting up expect to find one reconstruction log only. Need to make it so pick up a bunch of edit logs and it should be fine that logs are elsewhere in hdfs in an output directory written by all split participants whether multithreaded or a mapreduce-like distributed process (Lets write our distributed sort first as a MR so we learn whats involved; distributed sort, as much as possible should use MR framework pieces). On startup, regions go to this directory and pick up the files written by split participants deleting and clearing the dir when all have been read in. 
Making it so the split can take multiple logs for input can also make the split process more robust, rather than the current tenuous process which loses all edits if it doesn't make it to the end without error. 2. Each column family rereads the reconstruction log to find its edits. Need to fix that. Split can sort the edits by column family so each store only reads its own edits. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3750) HTablePool.putTable() should call table.flushCommits()
[ https://issues.apache.org/jira/browse/HBASE-3750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13016685#comment-13016685 ] stack commented on HBASE-3750: -- Do you think this is the way to go, Ted? How do you think people use the HTablePool? I'd think they'd check out an HTable instance, add an edit, then check it back in? If this is the case, then we'll flush each single edit. Or, do you think folks check out an HTable instance, keep it around a while putting multiple edits on it, and only then check it back in, perhaps not using it again? I wonder if a warning up in the javadoc that we do NOT flush on return to the pool, so the client needs to, would be a better way to go? I'm not sure. HTablePool.putTable() should call table.flushCommits() -- Key: HBASE-3750 URL: https://issues.apache.org/jira/browse/HBASE-3750 Project: HBase Issue Type: Bug Components: client Affects Versions: 0.90.1 Reporter: Ted Yu Assignee: Ted Yu Attachments: 3750.txt Currently HTablePool.putTable() doesn't call table.flushCommits(). This may turn out to be a surprise for users. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3734) HBaseAdmin creates new configurations in getCatalogTracker
[ https://issues.apache.org/jira/browse/HBASE-3734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13016687#comment-13016687 ] stack commented on HBASE-3734: -- Is this the right way to go? The javadoc on #getCatalogTracker says: bq. @return A new CatalogTracker instance; call {@link #cleanupCatalogTracker(CatalogTracker)} to cleanup the returned catalog tracker. Is the problem that cleanupCatalogTracker is not being called? HBaseAdmin creates new configurations in getCatalogTracker -- Key: HBASE-3734 URL: https://issues.apache.org/jira/browse/HBASE-3734 Project: HBase Issue Type: Bug Affects Versions: 0.90.1 Reporter: Jean-Daniel Cryans Assignee: Jean-Daniel Cryans Fix For: 0.90.3 Attachments: HBASE-3734.patch HBaseAdmin.getCatalogTracker creates a new Configuration every time it's called; instead, HBA should reuse the same one and do the copy inside the constructor. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3747) [replication] ReplicationSource should differentiate remote and local exceptions
[ https://issues.apache.org/jira/browse/HBASE-3747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13016688#comment-13016688 ] stack commented on HBASE-3747: -- +1 [replication] ReplicationSource should differentiate remote and local exceptions Key: HBASE-3747 URL: https://issues.apache.org/jira/browse/HBASE-3747 Project: HBase Issue Type: Improvement Affects Versions: 0.90.1 Reporter: Jean-Daniel Cryans Assignee: Jean-Daniel Cryans Priority: Minor Fix For: 0.90.3 Attachments: HBASE-3747.patch From Jeff Whiting on the list: I'm not sure... the key to everything was picking up on RemoteException in "Unable to replicate because org.apache.hadoop.ipc.RemoteException" and realizing that it was on the replication cluster. If it said something like "Unable to replicate. Destination cluster threw an exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: java.lang.RuntimeException:" or something that makes it clear it is an exception on the remote or destination cluster, that would be helpful. It is easy to scan over org.apache.hadoop.ipc.RemoteException and read it like org.ap...some-kind-of...exception. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
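The kind of message Jeff is asking for amounts to branching on RemoteException before logging. A minimal sketch with a hypothetical helper class and message text, not the attached HBASE-3747.patch:
{code}
import java.io.IOException;
import org.apache.hadoop.ipc.RemoteException;

final class ReplicationLogging {
  private ReplicationLogging() {}

  // RemoteException means the error was raised on the destination (slave) cluster;
  // anything else is a local problem on the replicating source.
  static String describe(IOException e) {
    if (e instanceof RemoteException) {
      return "Unable to replicate: the destination (slave) cluster threw " + e.getMessage();
    }
    return "Unable to replicate because of a local error: " + e.getMessage();
  }
}
{code}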
[jira] [Updated] (HBASE-3750) HTablePool.putTable() should call table.flushCommits()
[ https://issues.apache.org/jira/browse/HBASE-3750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-3750: -- Attachment: (was: 3750.txt) HTablePool.putTable() should call table.flushCommits() -- Key: HBASE-3750 URL: https://issues.apache.org/jira/browse/HBASE-3750 Project: HBase Issue Type: Bug Components: client Affects Versions: 0.90.1 Reporter: Ted Yu Assignee: Ted Yu Attachments: 3750.txt Currently HTablePool.putTable() doesn't call table.flushCommits(). This may turn out to be a surprise for users. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-3750) HTablePool.putTable() should call table.flushCommits()
[ https://issues.apache.org/jira/browse/HBASE-3750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-3750: -- Attachment: 3750.txt Added a check for isAutoFlush() before calling flushCommits(). HTablePool.putTable() should call table.flushCommits() -- Key: HBASE-3750 URL: https://issues.apache.org/jira/browse/HBASE-3750 Project: HBase Issue Type: Bug Components: client Affects Versions: 0.90.1 Reporter: Ted Yu Assignee: Ted Yu Attachments: 3750.txt Currently HTablePool.putTable() doesn't call table.flushCommits(). This may turn out to be a surprise for users. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
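In other words, the behavior the patch describes is roughly the following; this is a sketch against the public HTableInterface, not the attached 3750.txt, and the class and method names are illustrative:
{code}
import java.io.IOException;
import org.apache.hadoop.hbase.client.HTableInterface;

public final class PoolReturnSketch {
  private PoolReturnSketch() {}

  public static void returnToPool(HTableInterface table) throws IOException {
    // Only flush when autoFlush is off; with autoFlush on, every put was already sent.
    if (!table.isAutoFlush()) {
      table.flushCommits();   // push the client-side write buffer before the table is reused
    }
    // ... then hand the instance back to the pool's internal queue ...
  }
}
{code}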
[jira] [Commented] (HBASE-3744) createTable blocks until all regions are out of transition
[ https://issues.apache.org/jira/browse/HBASE-3744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13016692#comment-13016692 ] stack commented on HBASE-3744: -- BulkStartupAssigner was for bulk assigning on startup. Seems like it's been pulled around. Could BulkStartupAssigner be renamed BulkOpener? And BulkDisabler named BulkCloser? I see need of a BulkOpen on regionserver crash (if it's not being used already). createTable blocks until all regions are out of transition -- Key: HBASE-3744 URL: https://issues.apache.org/jira/browse/HBASE-3744 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.90.1 Reporter: Todd Lipcon Assignee: Ted Yu Fix For: 0.92.0 Attachments: 3744.txt In HBASE-3305, the behavior of createTable was changed, which introduced this bug: createTable now blocks until all regions have been assigned, since it uses BulkStartupAssigner. BulkStartupAssigner.waitUntilDone calls assignmentManager.waitUntilNoRegionsInTransition, which waits across all regions, not just the regions of the table that has just been created. We saw an issue where one table had a region which was unable to be opened, so it was stuck in RegionsInTransition permanently (every open was failing). Since this was the case, waitUntilDone would always block indefinitely even though the newly created table had been assigned. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
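A sketch of the table-scoped wait being suggested, using hypothetical helper names rather than the actual AssignmentManager internals:
{code}
import java.util.Set;

final class TableScopedWait {
  interface RitView {
    /** Region names currently in transition, e.g. as reported by the assignment manager. */
    Set<String> regionsInTransition();
  }

  // Block only until the *new table's* regions leave the regions-in-transition set,
  // instead of waiting for the whole cluster's RIT map to drain.
  static void waitForTableAssigned(RitView rit, Set<String> newTableRegions, long pollMs)
      throws InterruptedException {
    while (true) {
      Set<String> inTransition = rit.regionsInTransition();
      boolean anyPending = false;
      for (String region : newTableRegions) {
        if (inTransition.contains(region)) {
          anyPending = true;
          break;
        }
      }
      if (!anyPending) {
        return;   // a stuck region belonging to some other table no longer blocks us
      }
      Thread.sleep(pollMs);
    }
  }
}
{code}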
[jira] [Commented] (HBASE-3750) HTablePool.putTable() should call table.flushCommits()
[ https://issues.apache.org/jira/browse/HBASE-3750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13016691#comment-13016691 ] Ted Yu commented on HBASE-3750: --- Thanks for the review, Stack. I attached a modified version. I think if the user turns off AutoFlush, we have a reason to flush for them. We can also poll the user mailing list to see their usage patterns. Looking at the javadoc: {code} * Once you are done with it, return it to the pool with {@link #putTable(HTableInterface)}. {code} I don't think putting a single edit means 'done' with the table instance unless there was really just one edit. HTablePool.putTable() should call table.flushCommits() -- Key: HBASE-3750 URL: https://issues.apache.org/jira/browse/HBASE-3750 Project: HBase Issue Type: Bug Components: client Affects Versions: 0.90.1 Reporter: Ted Yu Assignee: Ted Yu Attachments: 3750.txt Currently HTablePool.putTable() doesn't call table.flushCommits(). This may turn out to be a surprise for users. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3743) Throttle major compaction
[ https://issues.apache.org/jira/browse/HBASE-3743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13016694#comment-13016694 ] stack commented on HBASE-3743: -- As a workaround, we could run a script external to hbase that would first elicit the set of regions in a cluster and then, per region, set in motion a major compaction, waiting on completion before moving to the next region (the script could check HDFS and count storefiles in the region to figure completion of the region's major compaction). The script could be run from cron or, as per the painting of the Golden Gate legend, once we'd gotten to the end of the bridge/table, we would loop around and start in again on the first region, in perpetuum. Throttle major compaction - Key: HBASE-3743 URL: https://issues.apache.org/jira/browse/HBASE-3743 Project: HBase Issue Type: Improvement Components: regionserver Reporter: Joep Rottinghuis Add the ability to throttle major compaction. For those use cases when a stop-the-world approach is not practical, it is useful to be able to throttle the impact that major compaction has on the cluster. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
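That external workaround could be driven from the Java client as well. A rough sketch, assuming the 0.90-era client API and using a fixed sleep in place of the storefile-count check stack describes:
{code}
import java.util.Map;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HRegionInfo;
import org.apache.hadoop.hbase.HServerAddress;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.client.HTable;

public class RollingMajorCompact {
  public static void main(String[] args) throws Exception {
    String tableName = args[0];
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin admin = new HBaseAdmin(conf);
    HTable table = new HTable(conf, tableName);

    // Compact one region at a time instead of the whole table at once.
    Map<HRegionInfo, HServerAddress> regions = table.getRegionsInfo();
    for (HRegionInfo region : regions.keySet()) {
      admin.majorCompact(region.getRegionNameAsString());
      // Crude pacing: a real script would poll HDFS and move on only once the
      // region is back down to a single storefile.
      Thread.sleep(60 * 1000L);
    }
  }
}
{code}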
[jira] [Commented] (HBASE-3722) A lot of data is lost when name node crashed
[ https://issues.apache.org/jira/browse/HBASE-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13016696#comment-13016696 ] stack commented on HBASE-3722: -- That seems like a harmless addition. Do you think it would help w/ the issue you saw, Gao Jinchao? If so, I can commit. A lot of data is lost when name node crashed - Key: HBASE-3722 URL: https://issues.apache.org/jira/browse/HBASE-3722 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.90.1 Reporter: gaojinchao Attachments: HmasterFilesystem_PatchV1.patch I'm not sure exactly what caused it; there are some failed-split logs. The master should shut itself down when HDFS has crashed. The logs are:
2011-03-22 13:21:55,056 WARN org.apache.hadoop.hbase.master.LogCleaner: Error while cleaning the logs java.net.ConnectException: Call to C4C1/157.5.100.1:9000 failed on connection exception: java.net.ConnectException: Connection refused at org.apache.hadoop.ipc.Client.wrapException(Client.java:844) at org.apache.hadoop.ipc.Client.call(Client.java:820) at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:221) at $Proxy5.getListing(Unknown Source) at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82) at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59) at $Proxy5.getListing(Unknown Source) at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:614) at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:252) at org.apache.hadoop.hbase.master.LogCleaner.chore(LogCleaner.java:121) at org.apache.hadoop.hbase.Chore.run(Chore.java:66) at org.apache.hadoop.hbase.master.LogCleaner.run(LogCleaner.java:154) Caused by: java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574) at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206) at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:408) at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:332) at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:202) at org.apache.hadoop.ipc.Client.getConnection(Client.java:943) at org.apache.hadoop.ipc.Client.call(Client.java:788) ... 13 more
2011-03-22 13:21:56,056 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: C4C1/157.5.100.1:9000. Already tried 0 time(s).
2011-03-22 13:21:57,057 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: C4C1/157.5.100.1:9000. Already tried 1 time(s).
2011-03-22 13:21:58,057 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: C4C1/157.5.100.1:9000. Already tried 2 time(s).
2011-03-22 13:21:59,057 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: C4C1/157.5.100.1:9000. Already tried 3 time(s).
2011-03-22 13:22:00,058 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: C4C1/157.5.100.1:9000. Already tried 4 time(s).
2011-03-22 13:22:01,058 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: C4C1/157.5.100.1:9000. Already tried 5 time(s).
2011-03-22 13:22:02,059 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: C4C1/157.5.100.1:9000. Already tried 6 time(s).
2011-03-22 13:22:03,059 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: C4C1/157.5.100.1:9000. Already tried 7 time(s). 2011-03-22 13:22:04,059 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: C4C1/157.5.100.1:9000. Already tried 8 time(s). 2011-03-22 13:22:05,060 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: C4C1/157.5.100.1:9000. Already tried 9 time(s). 2011-03-22 13:22:05,060 ERROR org.apache.hadoop.hbase.master.MasterFileSystem: Failed splitting hdfs://C4C1:9000/hbase/.logs/C4C9.site,60020,1300767633398 java.net.ConnectException: Call to C4C1/157.5.100.1:9000 failed on connection exception: java.net.ConnectException: Connection refused at org.apache.hadoop.ipc.Client.wrapException(Client.java:844) at org.apache.hadoop.ipc.Client.call(Client.java:820) at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:221) at $Proxy5.getFileInfo(Unknown Source) at
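A minimal sketch of the kind of filesystem check such a patch could add (hypothetical names; not necessarily what HmasterFilesystem_PatchV1.patch does): probe the HBase root directory and treat any IOException as a signal to abort the master rather than retrying forever and silently losing edits.
{code}
import java.io.IOException;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

final class FsCheckSketch {
  private FsCheckSketch() {}

  /** @return false when the root dir is unreachable; the caller should abort the master. */
  static boolean checkFileSystem(FileSystem fs, Path rootdir) {
    try {
      return fs.exists(rootdir);   // any IOException here means HDFS is unusable
    } catch (IOException e) {
      return false;
    }
  }
}
{code}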
[jira] [Commented] (HBASE-3729) Get cells via shell with a time range predicate
[ https://issues.apache.org/jira/browse/HBASE-3729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13016697#comment-13016697 ] stack commented on HBASE-3729: -- So, it seems like I should get the right behavior if I take your last patch, Ted, and remove this portion: {code}
+elsif args[TIMERANGE]
+  vers = 3
+else
+  vers = 1
{code} Do you agree? If so, I'll test it. I'll add a note to the help too about how the user might want to add VERSIONS => 1 when setting TIMERANGE. Get cells via shell with a time range predicate --- Key: HBASE-3729 URL: https://issues.apache.org/jira/browse/HBASE-3729 Project: HBase Issue Type: New Feature Components: shell Reporter: Eric Charles Assignee: Ted Yu Attachments: 3729-v2.txt, 3729-v3.txt, 3729-v4.txt, 3729.txt The HBase shell allows you to specify a timestamp to get a value - get 't1', 'r1', {COLUMN => 'c1', TIMESTAMP => ts1} If you don't give the exact timestamp, you get nothing... so it's difficult to get a cell's previous versions. It would be fine to have a time-range-predicate-based get. The shell syntax could be (depending on technical feasibility) - get 't1', 'r1', {COLUMN => 'c1', TIMERANGE => (start_timestamp, end_timestamp)} -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
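For comparison, the Java client already exposes the equivalent of the proposed shell feature through Get.setTimeRange(); the shell change is essentially plumbing down to this. The table name, row, family/qualifier and timestamps below are illustrative:
{code}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

public class TimeRangeGetExample {
  public static void main(String[] args) throws IOException {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "t1");

    Get get = new Get(Bytes.toBytes("r1"));
    get.addColumn(Bytes.toBytes("c1"), Bytes.toBytes("q1"));
    get.setTimeRange(1302160000000L, 1302170000000L);  // [start, end)
    get.setMaxVersions(3);   // without this, only the newest version in the range is returned

    Result result = table.get(get);
    System.out.println(result);
  }
}
{code}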
[jira] [Commented] (HBASE-3744) createTable blocks until all regions are out of transition
[ https://issues.apache.org/jira/browse/HBASE-3744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13016698#comment-13016698 ] Ted Yu commented on HBASE-3744: --- BulkOpen isn't used. We also have: {code} class BulkEnabler extends BulkAssigner { {code} createTable blocks until all regions are out of transition -- Key: HBASE-3744 URL: https://issues.apache.org/jira/browse/HBASE-3744 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.90.1 Reporter: Todd Lipcon Assignee: Ted Yu Fix For: 0.92.0 Attachments: 3744.txt In HBASE-3305, the behavior of createTable was changed and introduced this bug: createTable now blocks until all regions have been assigned, since it uses BulkStartupAssigner. BulkStartupAssigner.waitUntilDone calls assignmentManager.waitUntilNoRegionsInTransition, which waits across all regions, not just the regions of the table that has just been created. We saw an issue where one table had a region which was unable to be opened, so it was stuck in RegionsInTransition permanently (every open was failing). Since this was the case, waitUntilDone would always block indefinitely even though the newly created table had been assigned. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira