[jira] [Created] (HBASE-17716) Formalize Scan Metric names
Karan Mehta created HBASE-17716: --- Summary: Formalize Scan Metric names Key: HBASE-17716 URL: https://issues.apache.org/jira/browse/HBASE-17716 Project: HBase Issue Type: Bug Components: metrics Reporter: Karan Mehta Assignee: Karan Mehta Priority: Minor HBase provides various metrics through the APIs exposed by the ScanMetrics class. The JIRA PHOENIX-3248 requires them to be surfaced through the Phoenix Metrics API. Currently these metrics are referred to via hard-coded strings, which are not formally defined, so any change to them can break the Phoenix API. Hence we need to refactor the code to assign enums for these metrics. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-17716) Formalize Scan Metric names
[ https://issues.apache.org/jira/browse/HBASE-17716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karan Mehta updated HBASE-17716: Attachment: HBASE-17716.patch > Formalize Scan Metric names > --- > > Key: HBASE-17716 > URL: https://issues.apache.org/jira/browse/HBASE-17716 > Project: HBase > Issue Type: Bug > Components: metrics >Reporter: Karan Mehta >Assignee: Karan Mehta >Priority: Minor > Attachments: HBASE-17716.patch > > > HBase provides various metrics through the API's exposed by ScanMetrics > class. > The JIRA PHOENIX-3248 requires them to be surfaced through the Phoenix > Metrics API. Currently these metrics are referred via hard-coded strings, > which are not formal and can break the Phoenix API. Hence we need to refactor > the code to assign enums for these metrics. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-17716) Formalize Scan Metric names
[ https://issues.apache.org/jira/browse/HBASE-17716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karan Mehta updated HBASE-17716: Status: Patch Available (was: Open) > Formalize Scan Metric names > --- > > Key: HBASE-17716 > URL: https://issues.apache.org/jira/browse/HBASE-17716 > Project: HBase > Issue Type: Bug > Components: metrics >Reporter: Karan Mehta >Assignee: Karan Mehta >Priority: Minor > Attachments: HBASE-17716.patch > > > HBase provides various metrics through the API's exposed by ScanMetrics > class. > The JIRA PHOENIX-3248 requires them to be surfaced through the Phoenix > Metrics API. Currently these metrics are referred via hard-coded strings, > which are not formal and can break the Phoenix API. Hence we need to refactor > the code to assign enums for these metrics. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-17698) ReplicationEndpoint choosing sinks
[ https://issues.apache.org/jira/browse/HBASE-17698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karan Mehta updated HBASE-17698: Attachment: HBASE-17698.patch > ReplicationEndpoint choosing sinks > -- > > Key: HBASE-17698 > URL: https://issues.apache.org/jira/browse/HBASE-17698 > Project: HBase > Issue Type: Bug > Components: Replication >Affects Versions: 2.0.0, 1.4.0 >Reporter: churro morales >Assignee: Karan Mehta > Attachments: HBASE-17698.patch > > > The only time we choose new sinks is when we have a ConnectException, but we > have encountered other exceptions where there is a problem contacting a > particular sink and replication gets backed up for any sources that try that > sink > HBASE-17675 occurred when there was a bad keytab refresh and the source was > stuck. > Another issue we recently had was a bad drive controller on the sink side and > replication was stuck again. > Is there any reason not to choose new sinks anytime we have a > RemoteException? I can understand TableNotFound we don't have to choose new > sinks, but for all other cases this seems like the safest approach. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
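The behaviour proposed in this issue can be sketched roughly as follows. This is a minimal, self-contained Ruby illustration and not the HBase implementation (which is Java): SinkManager, ship_with_retry and both error classes are hypothetical stand-ins invented for the example.

```ruby
# Hypothetical stand-ins for the remote failures discussed above.
class TableNotFoundError < StandardError; end
class RemoteCallError < StandardError; end

# Hypothetical sink manager: hands out one sink and can pick a fresh one.
class SinkManager
  attr_reader :current

  def initialize(sinks)
    @sinks = sinks
    @index = 0
    @current = @sinks[@index]
  end

  # Rotate to the next candidate sink.
  def choose_new_sink
    @index = (@index + 1) % @sinks.size
    @current = @sinks[@index]
  end
end

# Try to ship a batch; on any failure other than a missing table,
# choose a new sink before retrying. The block does the actual shipping.
def ship_with_retry(manager, max_attempts)
  attempts = 0
  begin
    attempts += 1
    yield manager.current
  rescue TableNotFoundError
    # A different sink would not help here, so keep the current one.
    raise
  rescue StandardError
    raise if attempts >= max_attempts
    manager.choose_new_sink
    retry
  end
end
```

The point of the sketch is only the shape of the rescue clauses: the missing-table case is re-raised without touching the sink, while every other failure triggers a new sink choice instead of waiting for one specific exception type.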
[jira] [Updated] (HBASE-17698) ReplicationEndpoint choosing sinks
[ https://issues.apache.org/jira/browse/HBASE-17698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karan Mehta updated HBASE-17698: Status: Patch Available (was: Open) > ReplicationEndpoint choosing sinks > -- > > Key: HBASE-17698 > URL: https://issues.apache.org/jira/browse/HBASE-17698 > Project: HBase > Issue Type: Bug > Components: Replication >Affects Versions: 2.0.0, 1.4.0 >Reporter: churro morales >Assignee: Karan Mehta > Attachments: HBASE-17698.patch > > > The only time we choose new sinks is when we have a ConnectException, but we > have encountered other exceptions where there is a problem contacting a > particular sink and replication gets backed up for any sources that try that > sink > HBASE-17675 occurred when there was a bad keytab refresh and the source was > stuck. > Another issue we recently had was a bad drive controller on the sink side and > replication was stuck again. > Is there any reason not to choose new sinks anytime we have a > RemoteException? I can understand TableNotFound we don't have to choose new > sinks, but for all other cases this seems like the safest approach. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-17716) Formalize Scan Metric names
[ https://issues.apache.org/jira/browse/HBASE-17716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karan Mehta updated HBASE-17716: Attachment: HBASE-17716_v2.patch > Formalize Scan Metric names > --- > > Key: HBASE-17716 > URL: https://issues.apache.org/jira/browse/HBASE-17716 > Project: HBase > Issue Type: Bug > Components: metrics >Reporter: Karan Mehta >Assignee: Karan Mehta >Priority: Minor > Attachments: HBASE-17716.patch, HBASE-17716_v2.patch > > > HBase provides various metrics through the API's exposed by ScanMetrics > class. > The JIRA PHOENIX-3248 requires them to be surfaced through the Phoenix > Metrics API. Currently these metrics are referred via hard-coded strings, > which are not formal and can break the Phoenix API. Hence we need to refactor > the code to assign enums for these metrics. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-17716) Formalize Scan Metric names
[ https://issues.apache.org/jira/browse/HBASE-17716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15898614#comment-15898614 ] Karan Mehta commented on HBASE-17716: - Added a patch that appends the suffix "_METRIC_NAME" to the metric name string constants and makes them public static final. > Formalize Scan Metric names > --- > > Key: HBASE-17716 > URL: https://issues.apache.org/jira/browse/HBASE-17716 > Project: HBase > Issue Type: Bug > Components: metrics >Reporter: Karan Mehta >Assignee: Karan Mehta >Priority: Minor > Attachments: HBASE-17716.patch, HBASE-17716_v2.patch > > > HBase provides various metrics through the API's exposed by ScanMetrics > class. > The JIRA PHOENIX-3248 requires them to be surfaced through the Phoenix > Metrics API. Currently these metrics are referred via hard-coded strings, > which are not formal and can break the Phoenix API. Hence we need to refactor > the code to assign enums for these metrics. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
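The idea of named constants replacing hard-coded metric strings can be illustrated with a small sketch. It is written in Ruby only to match the shell snippets elsewhere in this thread; the patch itself is Java, and the module, the two metric names and read_metric below are assumptions made up for the example, not the contents of the patch.

```ruby
# Illustrative only: these names mimic the "_METRIC_NAME" convention
# described above and are not taken from the actual patch.
module ScanMetricNames
  RPC_CALLS_METRIC_NAME = "RPC_CALLS".freeze
  BYTES_IN_RESULTS_METRIC_NAME = "BYTES_IN_RESULTS".freeze

  ALL = [RPC_CALLS_METRIC_NAME, BYTES_IN_RESULTS_METRIC_NAME].freeze
end

# A consumer looks metrics up through the constants, so a renamed or
# removed constant fails loudly (NameError) instead of silently missing.
def read_metric(metrics, name)
  raise ArgumentError, "unknown metric: #{name}" unless ScanMetricNames::ALL.include?(name)
  metrics.fetch(name, 0)
end
```

A downstream caller would then write read_metric(metrics, ScanMetricNames::RPC_CALLS_METRIC_NAME) rather than repeating the string literal.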
[jira] [Commented] (HBASE-14925) Develop HBase shell command/tool to list table's region info through command line
[ https://issues.apache.org/jira/browse/HBASE-14925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15900044#comment-15900044 ] Karan Mehta commented on HBASE-14925: - Yes, I can take it up. > Develop HBase shell command/tool to list table's region info through command > line > - > > Key: HBASE-14925 > URL: https://issues.apache.org/jira/browse/HBASE-14925 > Project: HBase > Issue Type: Improvement > Components: shell >Reporter: Romil Choksi >Assignee: huaxiang sun > > I am going through the hbase shell commands to see if there is anything I can > use to get all the regions info just for a particular table. I don’t see any > such command that provides me that information. > It would be better to have a command that provides region info, start key, > end key etc taking a table name as the input parameter. This is available > through HBase UI on clicking on a particular table's link > A tool/shell command to get a list of regions for a table or all tables in a > tabular structured output (that is machine readable) -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Assigned] (HBASE-14925) Develop HBase shell command/tool to list table's region info through command line
[ https://issues.apache.org/jira/browse/HBASE-14925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karan Mehta reassigned HBASE-14925: --- Assignee: Karan Mehta (was: huaxiang sun) > Develop HBase shell command/tool to list table's region info through command > line > - > > Key: HBASE-14925 > URL: https://issues.apache.org/jira/browse/HBASE-14925 > Project: HBase > Issue Type: Improvement > Components: shell >Reporter: Romil Choksi >Assignee: Karan Mehta > > I am going through the hbase shell commands to see if there is anything I can > use to get all the regions info just for a particular table. I don’t see any > such command that provides me that information. > It would be better to have a command that provides region info, start key, > end key etc taking a table name as the input parameter. This is available > through HBase UI on clicking on a particular table's link > A tool/shell command to get a list of regions for a table or all tables in a > tabular structured output (that is machine readable) -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-14925) Develop HBase shell command/tool to list table's region info through command line
[ https://issues.apache.org/jira/browse/HBASE-14925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karan Mehta updated HBASE-14925: Attachment: HBASE-14925.patch > Develop HBase shell command/tool to list table's region info through command > line > - > > Key: HBASE-14925 > URL: https://issues.apache.org/jira/browse/HBASE-14925 > Project: HBase > Issue Type: Improvement > Components: shell >Reporter: Romil Choksi >Assignee: Karan Mehta > Attachments: HBASE-14925.patch > > > I am going through the hbase shell commands to see if there is anything I can > use to get all the regions info just for a particular table. I don’t see any > such command that provides me that information. > It would be better to have a command that provides region info, start key, > end key etc taking a table name as the input parameter. This is available > through HBase UI on clicking on a particular table's link > A tool/shell command to get a list of regions for a table or all tables in a > tabular structured output (that is machine readable) -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-14925) Develop HBase shell command/tool to list table's region info through command line
[ https://issues.apache.org/jira/browse/HBASE-14925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karan Mehta updated HBASE-14925: Status: Patch Available (was: In Progress) > Develop HBase shell command/tool to list table's region info through command > line > - > > Key: HBASE-14925 > URL: https://issues.apache.org/jira/browse/HBASE-14925 > Project: HBase > Issue Type: Improvement > Components: shell >Reporter: Romil Choksi >Assignee: Karan Mehta > Attachments: HBASE-14925.patch > > > I am going through the hbase shell commands to see if there is anything I can > use to get all the regions info just for a particular table. I don’t see any > such command that provides me that information. > It would be better to have a command that provides region info, start key, > end key etc taking a table name as the input parameter. This is available > through HBase UI on clicking on a particular table's link > A tool/shell command to get a list of regions for a table or all tables in a > tabular structured output (that is machine readable) -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-17698) ReplicationEndpoint choosing sinks
[ https://issues.apache.org/jira/browse/HBASE-17698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15926492#comment-15926492 ] Karan Mehta commented on HBASE-17698: - Hey [~apurtell], I have made the changes as suggested. Can you have a look? Thanks! > ReplicationEndpoint choosing sinks > -- > > Key: HBASE-17698 > URL: https://issues.apache.org/jira/browse/HBASE-17698 > Project: HBase > Issue Type: Bug > Components: Replication >Affects Versions: 2.0.0, 1.4.0 >Reporter: churro morales >Assignee: Karan Mehta > Attachments: HBASE-17698.patch > > > The only time we choose new sinks is when we have a ConnectException, but we > have encountered other exceptions where there is a problem contacting a > particular sink and replication gets backed up for any sources that try that > sink > HBASE-17675 occurred when there was a bad keytab refresh and the source was > stuck. > Another issue we recently had was a bad drive controller on the sink side and > replication was stuck again. > Is there any reason not to choose new sinks anytime we have a > RemoteException? I can understand TableNotFound we don't have to choose new > sinks, but for all other cases this seems like the safest approach. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-14925) Develop HBase shell command/tool to list table's region info through command line
[ https://issues.apache.org/jira/browse/HBASE-14925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15926502#comment-15926502 ] Karan Mehta commented on HBASE-14925: - Hey [~stack], I have implemented this method as per Ronan's suggestions and submitted the patch. Can you please have a look? Thanks! > Develop HBase shell command/tool to list table's region info through command > line > - > > Key: HBASE-14925 > URL: https://issues.apache.org/jira/browse/HBASE-14925 > Project: HBase > Issue Type: Improvement > Components: shell >Reporter: Romil Choksi >Assignee: Karan Mehta > Attachments: HBASE-14925.patch > > > I am going through the hbase shell commands to see if there is anything I can > use to get all the regions info just for a particular table. I don’t see any > such command that provides me that information. > It would be better to have a command that provides region info, start key, > end key etc taking a table name as the input parameter. This is available > through HBase UI on clicking on a particular table's link > A tool/shell command to get a list of regions for a table or all tables in a > tabular structured output (that is machine readable) -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-14925) Develop HBase shell command/tool to list table's region info through command line
[ https://issues.apache.org/jira/browse/HBASE-14925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15954480#comment-15954480 ] Karan Mehta commented on HBASE-14925: - [~ashish singhi]
{code}
def command(table_name, region_server_name = "")
  admin_instance = admin.instance_variable_get("@admin")
  conn_instance = admin_instance.getConnection()
  cluster_status = admin_instance.getClusterStatus()
  hregion_locator_instance = conn_instance.getRegionLocator(TableName.valueOf(table_name))
  list = hregion_locator_instance.getAllRegionLocations()
  results = Array.new
  begin
    list.each do |hregion|
      # An empty region_server_name matches every server.
      if hregion.getServerName().toString.start_with? region_server_name
        startKey = Bytes.toString(hregion.getRegionInfo().getStartKey())
        endKey = Bytes.toString(hregion.getRegionInfo().getEndKey())
        puts "All = #{hregion} , Start = #{startKey}, End = #{endKey}"
      end
    end
  ensure
    hregion_locator_instance.close()
  end
end
{code}
How does this code look? It filters the regions using the user-provided {{Server Name}} as a prefix.
> Develop HBase shell command/tool to list table's region info through command > line > - > > Key: HBASE-14925 > URL: https://issues.apache.org/jira/browse/HBASE-14925 > Project: HBase > Issue Type: Improvement > Components: shell >Reporter: Romil Choksi >Assignee: Karan Mehta > Attachments: HBASE-14925.patch > > > I am going through the hbase shell commands to see if there is anything I can > use to get all the regions info just for a particular table. I don’t see any > such command that provides me that information. > It would be better to have a command that provides region info, start key, > end key etc taking a table name as the input parameter. This is available > through HBase UI on clicking on a particular table's link > A tool/shell command to get a list of regions for a table or all tables in a > tabular structured output (that is machine readable) -- This message was sent by Atlassian JIRA (v6.3.15#6346)
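The prefix filter in the comment above can be tried outside the HBase shell with plain Ruby objects. In this sketch RegionLocation is a hypothetical Struct standing in for the Java region location objects, regions_on is an invented helper, and the server names are made-up sample data.

```ruby
# Hypothetical stand-in for a region location returned by the locator.
RegionLocation = Struct.new(:server_name, :start_key, :end_key)

# Keep the regions whose server name starts with the given prefix.
# String#start_with?("") is always true, so an empty prefix keeps everything.
def regions_on(locations, server_prefix = "")
  locations.select { |loc| loc.server_name.start_with?(server_prefix) }
end

locations = [
  RegionLocation.new("rs1.example.com,16020,1491400000000", "", "row500"),
  RegionLocation.new("rs2.example.com,16020,1491400000001", "row500", "")
]

all_regions = regions_on(locations)
rs2_regions = regions_on(locations, "rs2")
```

Because the default argument is the empty string, the same code path serves both "all regions of the table" and "regions on one server", at the cost of scanning every region location in either case.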
[jira] [Commented] (HBASE-14925) Develop HBase shell command/tool to list table's region info through command line
[ https://issues.apache.org/jira/browse/HBASE-14925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15955753#comment-15955753 ] Karan Mehta commented on HBASE-14925: - Thanks [~rstokes] for pointing out the use cases. They suggest that the command will not be used very often. We want the data to come back quickly, but there are no hard deadlines.
This code does handle the case where no region server is supplied, in which case information should be returned for all regions: {{start_with?}} matches every result when given an empty string. This patch is optimized for queries that just want all the regions of a particular table. For queries that also filter by region server, we have to iterate through every region of the table and pick the relevant ones, which may be expensive if the table has many regions.
The approach is similar to the code that {{table.jsp}} uses to display the information in the web UI; the same APIs are used. If the web UI is acceptable in terms of latency, this command should be too, I feel. Please share your comments on this.
Moreover, if we follow the approach of the previous patch
{code}
for server in cluster_status.getServers()
  for name,region in cluster_status.getLoad(server).getRegionsLoad()
    region_name = region.getNameAsString()
    regionStoreFileSize = region.getStorefileSizeMB()
    regionRequests = region.getRequestsCount()
    if region_name.start_with? tgtTable
      results << { "server" => server, "name" => region_name, "size" => regionStoreFileSize, "requests" => regionRequests }
    end
  end
end
{code}
it is difficult to retrieve the {{EndKey}} of a particular region, since {{HRegionInfo}} is not available anywhere. {{StartKey}} is available if we call {{getLoad(ServerName).getRegionsLoad()}}, retrieve the region name and parse it. I believe {{EndKey}} is something that we are required to display as well.
The previous patch's approach can still be used if we want to optimize queries that filter by both tableName and regionServerName, since the server check can be done before the second (inner) loop starts.
> Develop HBase shell command/tool to list table's region info through command > line > - > > Key: HBASE-14925 > URL: https://issues.apache.org/jira/browse/HBASE-14925 > Project: HBase > Issue Type: Improvement > Components: shell >Reporter: Romil Choksi >Assignee: Karan Mehta > Attachments: HBASE-14925.patch > > > I am going through the hbase shell commands to see if there is anything I can > use to get all the regions info just for a particular table. I don’t see any > such command that provides me that information. > It would be better to have a command that provides region info, start key, > end key etc taking a table name as the input parameter. This is available > through HBase UI on clicking on a particular table's link > A tool/shell command to get a list of regions for a table or all tables in a > tabular structured output (that is machine readable) -- This message was sent by Atlassian JIRA (v6.3.15#6346)
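Parsing the start key out of a region name, as mentioned in the comment above, might look like the sketch below. It assumes the region name has the shape "<table>,<start key>,<region id>.<encoded name>." and that table names contain no commas; parse_region_name is a hypothetical helper, and the sample names are invented. The end key cannot be recovered this way, which is the limitation the comment describes.

```ruby
# Split "<table>,<start key>,<region id>.<encoded name>." into its parts.
# The start key may itself contain commas, so split on the first and the
# last comma rather than on every comma.
def parse_region_name(region_name)
  first = region_name.index(",")
  last = region_name.rindex(",")
  raise ArgumentError, "not a region name: #{region_name}" if first.nil? || first == last

  {
    table: region_name[0...first],
    start_key: region_name[(first + 1)...last],
    suffix: region_name[(last + 1)..-1]
  }
end

parsed = parse_region_name("t1,row,500,1491400000000.abc.")
first_region = parse_region_name("t1,,1491400000000.abc.")
```

The first region of a table has an empty start key, so the two commas are adjacent and the parsed start key is the empty string.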
[jira] [Updated] (HBASE-14925) Develop HBase shell command/tool to list table's region info through command line
[ https://issues.apache.org/jira/browse/HBASE-14925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karan Mehta updated HBASE-14925: Attachment: HBASE-14925.002.patch > Develop HBase shell command/tool to list table's region info through command > line > - > > Key: HBASE-14925 > URL: https://issues.apache.org/jira/browse/HBASE-14925 > Project: HBase > Issue Type: Improvement > Components: shell >Reporter: Romil Choksi >Assignee: Karan Mehta > Attachments: HBASE-14925.002.patch, HBASE-14925.patch > > > I am going through the hbase shell commands to see if there is anything I can > use to get all the regions info just for a particular table. I don’t see any > such command that provides me that information. > It would be better to have a command that provides region info, start key, > end key etc taking a table name as the input parameter. This is available > through HBase UI on clicking on a particular table's link > A tool/shell command to get a list of regions for a table or all tables in a > tabular structured output (that is machine readable) -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-14925) Develop HBase shell command/tool to list table's region info through command line
[ https://issues.apache.org/jira/browse/HBASE-14925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15955953#comment-15955953 ] Karan Mehta commented on HBASE-14925: - Added a new patch with required changes. Can you take a look [~rstokes] and [~ashish singhi]? > Develop HBase shell command/tool to list table's region info through command > line > - > > Key: HBASE-14925 > URL: https://issues.apache.org/jira/browse/HBASE-14925 > Project: HBase > Issue Type: Improvement > Components: shell >Reporter: Romil Choksi >Assignee: Karan Mehta > Attachments: HBASE-14925.002.patch, HBASE-14925.patch > > > I am going through the hbase shell commands to see if there is anything I can > use to get all the regions info just for a particular table. I don’t see any > such command that provides me that information. > It would be better to have a command that provides region info, start key, > end key etc taking a table name as the input parameter. This is available > through HBase UI on clicking on a particular table's link > A tool/shell command to get a list of regions for a table or all tables in a > tabular structured output (that is machine readable) -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-14925) Develop HBase shell command/tool to list table's region info through command line
[ https://issues.apache.org/jira/browse/HBASE-14925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karan Mehta updated HBASE-14925: Attachment: HBASE-14925.003.patch > Develop HBase shell command/tool to list table's region info through command > line > - > > Key: HBASE-14925 > URL: https://issues.apache.org/jira/browse/HBASE-14925 > Project: HBase > Issue Type: Improvement > Components: shell >Reporter: Romil Choksi >Assignee: Karan Mehta > Attachments: HBASE-14925.002.patch, HBASE-14925.003.patch, > HBASE-14925.patch > > > I am going through the hbase shell commands to see if there is anything I can > use to get all the regions info just for a particular table. I don’t see any > such command that provides me that information. > It would be better to have a command that provides region info, start key, > end key etc taking a table name as the input parameter. This is available > through HBase UI on clicking on a particular table's link > A tool/shell command to get a list of regions for a table or all tables in a > tabular structured output (that is machine readable) -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-14925) Develop HBase shell command/tool to list table's region info through command line
[ https://issues.apache.org/jira/browse/HBASE-14925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15957799#comment-15957799 ] Karan Mehta commented on HBASE-14925: - {quote} Where are we using this ? @end_time = Time.now {quote} This is to determine the time taken for the command to execute, the {{start_time}} variable is initialized before the call to this function. Added a new patch with proper formatting now. Suggest if there are better ways to format it. The {{formatter.rb}} file doesn't help format if there are more than 2 columns to be displayed. > Develop HBase shell command/tool to list table's region info through command > line > - > > Key: HBASE-14925 > URL: https://issues.apache.org/jira/browse/HBASE-14925 > Project: HBase > Issue Type: Improvement > Components: shell >Reporter: Romil Choksi >Assignee: Karan Mehta > Attachments: HBASE-14925.002.patch, HBASE-14925.003.patch, > HBASE-14925.patch > > > I am going through the hbase shell commands to see if there is anything I can > use to get all the regions info just for a particular table. I don’t see any > such command that provides me that information. > It would be better to have a command that provides region info, start key, > end key etc taking a table name as the input parameter. This is available > through HBase UI on clicking on a particular table's link > A tool/shell command to get a list of regions for a table or all tables in a > tabular structured output (that is machine readable) -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Comment Edited] (HBASE-14925) Develop HBase shell command/tool to list table's region info through command line
[ https://issues.apache.org/jira/browse/HBASE-14925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15957799#comment-15957799 ] Karan Mehta edited comment on HBASE-14925 at 4/5/17 9:46 PM: - {quote} Where are we using this ? @end_time = Time.now {quote} This is used to determine the time taken for the command to execute; the {{start_time}} variable is initialized before the call to this function. Added a new patch with proper formatting now. Please suggest if there are better ways to format it. The formatting may be slightly disrupted if table names are long or the screen is narrow, so that a single row of data wraps onto two or more lines. The {{formatter.rb}} file doesn't help with formatting when more than 2 columns need to be displayed. was (Author: karanmehta93): {quote} Where are we using this ? @end_time = Time.now {quote} This is to determine the time taken for the command to execute, the {{start_time}} variable is initialized before the call to this function. Added a new patch with proper formatting now. Suggest if there are better ways to format it. The {{formatter.rb}} file doesn't help format if there are more than 2 columns to be displayed. > Develop HBase shell command/tool to list table's region info through command > line > - > > Key: HBASE-14925 > URL: https://issues.apache.org/jira/browse/HBASE-14925 > Project: HBase > Issue Type: Improvement > Components: shell >Reporter: Romil Choksi >Assignee: Karan Mehta > Attachments: HBASE-14925.002.patch, HBASE-14925.003.patch, > HBASE-14925.patch > > > I am going through the hbase shell commands to see if there is anything I can > use to get all the regions info just for a particular table. I don’t see any > such command that provides me that information. > It would be better to have a command that provides region info, start key, > end key etc taking a table name as the input parameter. 
This is available > through HBase UI on clicking on a particular table's link > A tool/shell command to get a list of regions for a table or all tables in a > tabular structured output (that is machine readable) -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-14925) Develop HBase shell command/tool to list table's region info through command line
[ https://issues.apache.org/jira/browse/HBASE-14925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15959491#comment-15959491 ] Karan Mehta commented on HBASE-14925: - bq. I could not find the start_time variable in the patch and also end_time variable value not being used anywhere in the patch. What am I missing ? Can you point me ?
Here is the code snippet from the {{commands.rb}} file. The method {{command_safe()}} is a wrapper around the actual command call, and it initializes the instance variable {{@start_time}}. It also automatically computes the value of {{@end_time}} and displays the total execution time. A command can override these values. Since I didn't want the output formatting and display time to be counted as part of the command's execution, I set {{@end_time}} before the output code runs. Does that seem okay?
{code}
# wrap an execution of cmd to catch hbase exceptions
# cmd - command name to execute
# args - arguments to pass to the command
def command_safe(debug, cmd = :command, *args)
  # Commands can overwrite start_time to skip time used in some kind of setup.
  # See count.rb for example.
  @start_time = Time.now
  # send is internal ruby method to call 'cmd' with *args
  # (everything is a message, so this is just the formal semantics to support that idiom)
  translate_hbase_exceptions(*args) { send(cmd, *args) }
rescue => e
  rootCause = e
  while rootCause != nil && rootCause.respond_to?(:cause) && rootCause.cause != nil
    rootCause = rootCause.cause
  end
  if @shell.interactive?
    puts
    puts "ERROR: #{rootCause}"
    puts "Backtrace: #{rootCause.backtrace.join("\n ")}" if debug
    puts
    puts help
    puts
  else
    raise rootCause
  end
ensure
  # If end_time is not already set by the command, use current time.
  @end_time ||= Time.now
  formatter.output_str("Took %.4f seconds" % [@end_time - @start_time])
end
{code}
> Develop HBase shell command/tool to list table's region info through command > line > - > > Key: HBASE-14925 > URL: https://issues.apache.org/jira/browse/HBASE-14925 > Project: HBase > Issue Type: Improvement > Components: shell >Reporter: Romil Choksi >Assignee: Karan Mehta > Attachments: HBASE-14925.002.patch, HBASE-14925.003.patch, > HBASE-14925.patch > > > I am going through the hbase shell commands to see if there is anything I can > use to get all the regions info just for a particular table. I don’t see any > such command that provides me that information. > It would be better to have a command that provides region info, start key, > end key etc taking a table name as the input parameter. This is available > through HBase UI on clicking on a particular table's link > A tool/shell command to get a list of regions for a table or all tables in a > tabular structured output (that is machine readable) -- This message was sent by Atlassian JIRA (v6.3.15#6346)
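The timing behaviour described in the comment above relies on Ruby's conditional assignment: the wrapper only fills in the end time if the command has not set it already. A minimal sketch of that pattern, independent of the HBase shell, is shown here. MiniCommand, stop_clock and the injectable clock are hypothetical names introduced for the example; the real wrapper is command_safe as quoted.

```ruby
# Hypothetical miniature of the wrapper pattern quoted above.
class MiniCommand
  attr_reader :start_time, :end_time

  # clock is any callable returning the current time; injectable for testing.
  def initialize(clock = -> { Time.now })
    @clock = clock
  end

  # Wrapper: record the start, run the command body, and fill in the end
  # time only if the body did not already set it.
  def run
    @start_time = @clock.call
    yield self
  ensure
    @end_time ||= @clock.call
  end

  # Called by a command body to stop the clock before slow output formatting.
  def stop_clock
    @end_time = @clock.call
  end

  def elapsed
    @end_time - @start_time
  end
end
```

A command body that calls stop_clock before printing its table reports only the time spent fetching data; a body that never calls it gets the default measurement taken in the ensure clause.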
[jira] [Updated] (HBASE-17965) Canary tool should print the regionserver name on failure
[ https://issues.apache.org/jira/browse/HBASE-17965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karan Mehta updated HBASE-17965: Attachment: HBASE-17965.001.patch > Canary tool should print the regionserver name on failure > - > > Key: HBASE-17965 > URL: https://issues.apache.org/jira/browse/HBASE-17965 > Project: HBase > Issue Type: Task >Reporter: churro morales >Assignee: Karan Mehta >Priority: Minor > Attachments: HBASE-17965.001.patch > > > It would be nice when we have a canary failure for a region to print the > associated regionserver's name in the log as well. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-17965) Canary tool should print the regionserver name on failure
[ https://issues.apache.org/jira/browse/HBASE-17965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karan Mehta updated HBASE-17965: Status: Patch Available (was: Open) > Canary tool should print the regionserver name on failure > - > > Key: HBASE-17965 > URL: https://issues.apache.org/jira/browse/HBASE-17965 > Project: HBase > Issue Type: Task >Reporter: churro morales >Assignee: Karan Mehta >Priority: Minor > Attachments: HBASE-17965.001.patch > > > It would be nice when we have a canary failure for a region to print the > associated regionserver's name in the log as well. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-14925) Develop HBase shell command/tool to list table's region info through command line
[ https://issues.apache.org/jira/browse/HBASE-14925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15985849#comment-15985849 ] Karan Mehta commented on HBASE-14925: - Can you take a final look [~ashish singhi] and suggest changes? > Develop HBase shell command/tool to list table's region info through command > line > - > > Key: HBASE-14925 > URL: https://issues.apache.org/jira/browse/HBASE-14925 > Project: HBase > Issue Type: Improvement > Components: shell >Reporter: Romil Choksi >Assignee: Karan Mehta > Attachments: HBASE-14925.002.patch, HBASE-14925.003.patch, > HBASE-14925.patch > > > I am going through the hbase shell commands to see if there is anything I can > use to get all the regions info just for a particular table. I don’t see any > such command that provides me that information. > It would be better to have a command that provides region info, start key, > end key etc taking a table name as the input parameter. This is available > through HBase UI on clicking on a particular table's link > A tool/shell command to get a list of regions for a table or all tables in a > tabular structured output (that is machine readable) -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-14925) Develop HBase shell command/tool to list table's region info through command line
[ https://issues.apache.org/jira/browse/HBASE-14925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15990017#comment-15990017 ] Karan Mehta commented on HBASE-14925: - Thank you [~enis] and [~busbey] for your comments. I can look into them, however I am not completely sure how to address all of them. 1. I will update all usages of Bytes::toString() to Bytes::toStringBinary(). 2. The internal formatter class is not good enough to display the output in proper tabular form, so I tried setting the column width manually to a fixed size. Let me try out some other approach as well. I am not sure how the output would look if the screen width is smaller than the row width. 3. For projecting out specific columns, I can provide an addendum that lets users add extra parameters after the initial params. If the user doesn't provide any column, by default it will display all the data, else it will only display the requested ones. Please suggest how the user should provide the input. I believe that the command line argument parser is only able to parse data separated with commas. 4. I am not sure how release notes are handled since I am new to this, but I am happy to learn. Please provide me with relevant info. > Develop HBase shell command/tool to list table's region info through command > line > - > > Key: HBASE-14925 > URL: https://issues.apache.org/jira/browse/HBASE-14925 > Project: HBase > Issue Type: Improvement > Components: shell >Reporter: Romil Choksi >Assignee: Karan Mehta > Fix For: 2.0.0, 1.4.0 > > Attachments: HBASE-14925.002.patch, HBASE-14925.003.patch, > HBASE-14925.patch > > > I am going through the hbase shell commands to see if there is anything I can > use to get all the regions info just for a particular table. I don’t see any > such command that provides me that information. > It would be better to have a command that provides region info, start key, > end key etc taking a table name as the input parameter.
This is available > through HBase UI on clicking on a particular table's link > A tool/shell command to get a list of regions for a table or all tables in a > tabular structured output (that is machine readable) -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-17973) Create shell command to identify regions with poor locality
[ https://issues.apache.org/jira/browse/HBASE-17973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15993800#comment-15993800 ] Karan Mehta commented on HBASE-17973: - [~elserj] The patch is committed already but the JIRA status is shown as unresolved. What is its current status? I need to put an addendum for HBASE-14925, but when I try running {{list_regions}} command, I get the error as {code} ERROR: undefined method `filter' for # {code} Can you please look into this? > Create shell command to identify regions with poor locality > --- > > Key: HBASE-17973 > URL: https://issues.apache.org/jira/browse/HBASE-17973 > Project: HBase > Issue Type: Improvement > Components: shell >Reporter: Josh Elser >Assignee: Josh Elser > Fix For: 2.0.0 > > Attachments: HBASE-17973.001.patch, HBASE-17973.002.patch, > HBASE-17973.003.patch > > > The data locality of regions often plays a large role in the efficiency of > HBase. Compactions are also expensive to execute, especially on very large > tables. The balancer can do a good job trying to maintain locality (when > tuned properly), but it is not perfect. > This creates a less-than-desirable situation where it's a costly operation to > take a cluster with spotty poor locality (e.g. a small percentage of > regionservers with poor locality). > We already have this information available via the {{ClusterStatus}} proto. > We can easily write a shell command that can present regions which are > lacking a certain percentage of locality. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-17973) Create shell command to identify regions with poor locality
[ https://issues.apache.org/jira/browse/HBASE-17973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15993846#comment-15993846 ] Karan Mehta commented on HBASE-17973: - [~elserj] I had a look at the comment earlier but was wondering whether there is any internal {{filter}} method available on Arrays, since none was explicitly defined. I didn't understand the purpose of this line and couldn't find anything relevant online. Can you please clarify what this is for? {code} + regions = hregion_locator_list.filter do |hregion| {code} Thank you! > Create shell command to identify regions with poor locality > --- > > Key: HBASE-17973 > URL: https://issues.apache.org/jira/browse/HBASE-17973 > Project: HBase > Issue Type: Improvement > Components: shell >Reporter: Josh Elser >Assignee: Josh Elser > Fix For: 2.0.0 > > Attachments: HBASE-17973.001.patch, HBASE-17973.002.patch, > HBASE-17973.003.patch > > > The data locality of regions often plays a large role in the efficiency of > HBase. Compactions are also expensive to execute, especially on very large > tables. The balancer can do a good job trying to maintain locality (when > tuned properly), but it is not perfect. > This creates a less-than-desirable situation where it's a costly operation to > take a cluster with spotty poor locality (e.g. a small percentage of > regionservers with poor locality). > We already have this information available via the {{ClusterStatus}} proto. > We can easily write a shell command that can present regions which are > lacking a certain percentage of locality. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-17973) Create shell command to identify regions with poor locality
[ https://issues.apache.org/jira/browse/HBASE-17973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15993921#comment-15993921 ] Karan Mehta commented on HBASE-17973: - Thank you for the clarification and the patch, [~elserj]. One more suggestion, could you just attach the patch you committed with this JIRA. It just makes things a little easier I feel. :) > Create shell command to identify regions with poor locality > --- > > Key: HBASE-17973 > URL: https://issues.apache.org/jira/browse/HBASE-17973 > Project: HBase > Issue Type: Improvement > Components: shell >Reporter: Josh Elser >Assignee: Josh Elser > Fix For: 2.0.0 > > Attachments: HBASE-17973.001.patch, HBASE-17973.002.patch, > HBASE-17973.003.patch > > > The data locality of regions often plays a large role in the efficiency of > HBase. Compactions are also expensive to execute, especially on very large > tables. The balancer can do a good job trying to maintain locality (when > tuned properly), but it is not perfect. > This creates a less-than-desirable situation where it's a costly operation to > take a cluster with spotty poor locality (e.g. a small percentage of > regionservers with poor locality). > We already have this information available via the {{ClusterStatus}} proto. > We can easily write a shell command that can present regions which are > lacking a certain percentage of locality. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-17973) Create shell command to identify regions with poor locality
[ https://issues.apache.org/jira/browse/HBASE-17973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15994221#comment-15994221 ] Karan Mehta commented on HBASE-17973: - Yes, it is there. Sorry for that. I didn't observe that v3 was an addendum and not a complete patch. Thanks! > Create shell command to identify regions with poor locality > --- > > Key: HBASE-17973 > URL: https://issues.apache.org/jira/browse/HBASE-17973 > Project: HBase > Issue Type: Improvement > Components: shell >Reporter: Josh Elser >Assignee: Josh Elser > Fix For: 2.0.0 > > Attachments: HBASE-17973.001.patch, HBASE-17973.002.patch, > HBASE-17973.003.patch > > > The data locality of regions often plays a large role in the efficiency of > HBase. Compactions are also expensive to execute, especially on very large > tables. The balancer can do a good job trying to maintain locality (when > tuned properly), but it is not perfect. > This creates a less-than-desirable situation where it's a costly operation to > take a cluster with spotty poor locality (e.g. a small percentage of > regionservers with poor locality). > We already have this information available via the {{ClusterStatus}} proto. > We can easily write a shell command that can present regions which are > lacking a certain percentage of locality. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-14925) Develop HBase shell command/tool to list table's region info through command line
[ https://issues.apache.org/jira/browse/HBASE-14925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karan Mehta updated HBASE-14925: Status: Patch Available (was: Reopened) > Develop HBase shell command/tool to list table's region info through command > line > - > > Key: HBASE-14925 > URL: https://issues.apache.org/jira/browse/HBASE-14925 > Project: HBase > Issue Type: Improvement > Components: shell >Reporter: Romil Choksi >Assignee: Karan Mehta > Fix For: 2.0.0, 1.4.0 > > Attachments: HBASE-14925.002.patch, > HBASE-14925.003.addendum.001.patch, HBASE-14925.003.patch, HBASE-14925.patch > > > I am going through the hbase shell commands to see if there is anything I can > use to get all the regions info just for a particular table. I don’t see any > such command that provides me that information. > It would be better to have a command that provides region info, start key, > end key etc taking a table name as the input parameter. This is available > through HBase UI on clicking on a particular table's link > A tool/shell command to get a list of regions for a table or all tables in a > tabular structured output (that is machine readable) -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-14925) Develop HBase shell command/tool to list table's region info through command line
[ https://issues.apache.org/jira/browse/HBASE-14925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karan Mehta updated HBASE-14925: Attachment: HBASE-14925.003.addendum.001.patch > Develop HBase shell command/tool to list table's region info through command > line > - > > Key: HBASE-14925 > URL: https://issues.apache.org/jira/browse/HBASE-14925 > Project: HBase > Issue Type: Improvement > Components: shell >Reporter: Romil Choksi >Assignee: Karan Mehta > Fix For: 2.0.0, 1.4.0 > > Attachments: HBASE-14925.002.patch, > HBASE-14925.003.addendum.001.patch, HBASE-14925.003.patch, HBASE-14925.patch > > > I am going through the hbase shell commands to see if there is anything I can > use to get all the regions info just for a particular table. I don’t see any > such command that provides me that information. > It would be better to have a command that provides region info, start key, > end key etc taking a table name as the input parameter. This is available > through HBase UI on clicking on a particular table's link > A tool/shell command to get a list of regions for a table or all tables in a > tabular structured output (that is machine readable) -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-14925) Develop HBase shell command/tool to list table's region info through command line
[ https://issues.apache.org/jira/browse/HBASE-14925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karan Mehta updated HBASE-14925: Release Note: Added a shell command 'list_regions' for displaying a table's region info through the command line. It lists all regions for a particular table as an array, optionally filtered by server name (as a prefix) and by maximum locality. By default, it returns all the regions for the table with any locality. The command displays server name, region name, start key, end key, size of the region in MB, number of requests and the locality. The information can be projected out via an array as the third parameter. By default, all of this information is displayed. Possible array values are SERVER_NAME, REGION_NAME, START_KEY, END_KEY, SIZE, REQ and LOCALITY. Values are not case sensitive. If you don't want to filter by server name, pass an empty hash or an empty string as shown below. Examples: hbase> list_regions 'table_name' hbase> list_regions 'table_name', 'server_name' hbase> list_regions 'table_name', {SERVER_NAME => 'server_name', LOCALITY_THRESHOLD => 0.8} hbase> list_regions 'table_name', {SERVER_NAME => 'server_name', LOCALITY_THRESHOLD => 0.8}, ['SERVER_NAME'] hbase> list_regions 'table_name', {}, ['SERVER_NAME', 'start_key'] hbase> list_regions 'table_name', '', ['SERVER_NAME', 'start_key'] > Develop HBase shell command/tool to list table's region info through command > line > - > > Key: HBASE-14925 > URL: https://issues.apache.org/jira/browse/HBASE-14925 > Project: HBase > Issue Type: Improvement > Components: shell >Reporter: Romil Choksi >Assignee: Karan Mehta > Fix For: 2.0.0, 1.4.0 > > Attachments: HBASE-14925.002.patch, > HBASE-14925.003.addendum.001.patch, HBASE-14925.003.patch, HBASE-14925.patch > > > I am going through the hbase shell commands to see if there is anything I can > use to get all the regions info just for a particular table.
I don’t see any > such command that provides me that information. > It would be better to have a command that provides region info, start key, > end key etc taking a table name as the input parameter. This is available > through HBase UI on clicking on a particular table's link > A tool/shell command to get a list of regions for a table or all tables in a > tabular structured output (that is machine readable) -- This message was sent by Atlassian JIRA (v6.3.15#6346)
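The column projection described in the release note above (an optional array of names, not case sensitive, with everything shown when the array is omitted) can be illustrated with a small sketch. This is not the shell's Ruby implementation: the class `RegionColumnProjection` and its `resolve` method are hypothetical names, and only the seven column names are taken from the release note.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

/** Hypothetical illustration of the projection rules; not part of the HBase shell. */
final class RegionColumnProjection {
    /** Column names as listed in the release note. */
    enum Column { SERVER_NAME, REGION_NAME, START_KEY, END_KEY, SIZE, REQ, LOCALITY }

    /**
     * Resolves the user-supplied names to columns, ignoring case.
     * A null or empty request means "show every column".
     * Unknown names make Enum.valueOf throw IllegalArgumentException.
     */
    static List<Column> resolve(List<String> requested) {
        if (requested == null || requested.isEmpty()) {
            return Arrays.asList(Column.values());
        }
        List<Column> columns = new ArrayList<>();
        for (String name : requested) {
            columns.add(Column.valueOf(name.trim().toUpperCase(Locale.ROOT)));
        }
        return columns;
    }

    public static void main(String[] args) {
        // Mirrors: list_regions 'table_name', {}, ['SERVER_NAME', 'start_key']
        System.out.println(resolve(Arrays.asList("SERVER_NAME", "start_key")));
        // Mirrors: list_regions 'table_name'
        System.out.println(resolve(null));
    }
}
```

Normalizing to upper case before the lookup is one simple way to get the "not case sensitive" behaviour; the real command may do this differently.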
[jira] [Commented] (HBASE-14925) Develop HBase shell command/tool to list table's region info through command line
[ https://issues.apache.org/jira/browse/HBASE-14925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15995664#comment-15995664 ] Karan Mehta commented on HBASE-14925: - I have put up an addendum patch and added the release notes. Can you have a look and provide your comments? > Develop HBase shell command/tool to list table's region info through command > line > - > > Key: HBASE-14925 > URL: https://issues.apache.org/jira/browse/HBASE-14925 > Project: HBase > Issue Type: Improvement > Components: shell >Reporter: Romil Choksi >Assignee: Karan Mehta > Fix For: 2.0.0, 1.4.0 > > Attachments: HBASE-14925.002.patch, > HBASE-14925.003.addendum.001.patch, HBASE-14925.003.patch, HBASE-14925.patch > > > I am going through the hbase shell commands to see if there is anything I can > use to get all the regions info just for a particular table. I don’t see any > such command that provides me that information. > It would be better to have a command that provides region info, start key, > end key etc taking a table name as the input parameter. This is available > through HBase UI on clicking on a particular table's link > A tool/shell command to get a list of regions for a table or all tables in a > tabular structured output (that is machine readable) -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HBASE-17998) Improve HBase RPC write throttling size estimation
Karan Mehta created HBASE-17998: --- Summary: Improve HBase RPC write throttling size estimation Key: HBASE-17998 URL: https://issues.apache.org/jira/browse/HBASE-17998 Project: HBase Issue Type: Improvement Reporter: Karan Mehta Assignee: Karan Mehta Currently when RPC throttling, the size of each put is estimated using a hardcoded value 100 bytes. This can be improved by using the protobuf size as an estimate, without having to deserialize or do a big refactoring. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-18000) Make sure we always return the scanner id with ScanResponse
[ https://issues.apache.org/jira/browse/HBASE-18000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15998528#comment-15998528 ] Karan Mehta commented on HBASE-18000: - Isn't the patch being applied to 1.3.1 version as well? > Make sure we always return the scanner id with ScanResponse > --- > > Key: HBASE-18000 > URL: https://issues.apache.org/jira/browse/HBASE-18000 > Project: HBase > Issue Type: Bug >Affects Versions: 2.0.0, 1.4.0, 1.3.1 >Reporter: Lars Hofhansl >Assignee: Duo Zhang > Fix For: 2.0.0, 1.4.0, 1.3.2 > > Attachments: HBASE-18000.patch > > > Some external tooling (like OpenTSDB) relies on the scanner id to tie > asynchronous responses back to their requests. > (see comments on HBASE-17489) -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-14925) Develop HBase shell command/tool to list table's region info through command line
[ https://issues.apache.org/jira/browse/HBASE-14925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15998960#comment-15998960 ] Karan Mehta commented on HBASE-14925: - Thank you for committing, [~ashish singhi] I will provide the addendum for branch-1 after I get a patch for HBASE-17973 for branch-1 since it is dependent on it, or else there will be merge conflicts. > Develop HBase shell command/tool to list table's region info through command > line > - > > Key: HBASE-14925 > URL: https://issues.apache.org/jira/browse/HBASE-14925 > Project: HBase > Issue Type: Improvement > Components: shell >Reporter: Romil Choksi >Assignee: Karan Mehta > Fix For: 2.0.0, 1.4.0 > > Attachments: HBASE-14925.002.patch, > HBASE-14925.003.addendum.001.patch, HBASE-14925.003.patch, HBASE-14925.patch > > > I am going through the hbase shell commands to see if there is anything I can > use to get all the regions info just for a particular table. I don’t see any > such command that provides me that information. > It would be better to have a command that provides region info, start key, > end key etc taking a table name as the input parameter. This is available > through HBase UI on clicking on a particular table's link > A tool/shell command to get a list of regions for a table or all tables in a > tabular structured output (that is machine readable) -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-17973) Create shell command to identify regions with poor locality
[ https://issues.apache.org/jira/browse/HBASE-17973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15998962#comment-15998962 ] Karan Mehta commented on HBASE-17973: - Hey [~elserj], Can you back-port the patch for {{branch-1}}? The addendum patch for HBASE-14925 (branch-1) is dependent on this. Thanks! > Create shell command to identify regions with poor locality > --- > > Key: HBASE-17973 > URL: https://issues.apache.org/jira/browse/HBASE-17973 > Project: HBase > Issue Type: Improvement > Components: shell >Reporter: Josh Elser >Assignee: Josh Elser > Fix For: 2.0.0 > > Attachments: HBASE-17973.001.patch, HBASE-17973.002.patch, > HBASE-17973.003.patch > > > The data locality of regions often plays a large role in the efficiency of > HBase. Compactions are also expensive to execute, especially on very large > tables. The balancer can do a good job trying to maintain locality (when > tuned properly), but it is not perfect. > This creates a less-than-desirable situation where it's a costly operation to > take a cluster with spotty poor locality (e.g. a small percentage of > regionservers with poor locality). > We already have this information available via the {{ClusterStatus}} proto. > We can easily write a shell command that can present regions which are > lacking a certain percentage of locality. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-17973) Create shell command to identify regions with poor locality
[ https://issues.apache.org/jira/browse/HBASE-17973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karan Mehta updated HBASE-17973: Attachment: HBASE-17973.branch-1.001.patch > Create shell command to identify regions with poor locality > --- > > Key: HBASE-17973 > URL: https://issues.apache.org/jira/browse/HBASE-17973 > Project: HBase > Issue Type: Improvement > Components: shell >Reporter: Josh Elser >Assignee: Josh Elser > Fix For: 2.0.0 > > Attachments: HBASE-17973.001.patch, HBASE-17973.002.patch, > HBASE-17973.003.patch, HBASE-17973.branch-1.001.patch > > > The data locality of regions often plays a large role in the efficiency of > HBase. Compactions are also expensive to execute, especially on very large > tables. The balancer can do a good job trying to maintain locality (when > tuned properly), but it is not perfect. > This creates a less-than-desirable situation where it's a costly operation to > take a cluster with spotty poor locality (e.g. a small percentage of > regionservers with poor locality). > We already have this information available via the {{ClusterStatus}} proto. > We can easily write a shell command that can present regions which are > lacking a certain percentage of locality. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-17973) Create shell command to identify regions with poor locality
[ https://issues.apache.org/jira/browse/HBASE-17973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15999029#comment-15999029 ] Karan Mehta commented on HBASE-17973: - [~apurtell] I have added a patch for branch-1. I have squashed all the patches and the addendum into a single patch. I hope it's fine that way. Please review whenever convenient. > Create shell command to identify regions with poor locality > --- > > Key: HBASE-17973 > URL: https://issues.apache.org/jira/browse/HBASE-17973 > Project: HBase > Issue Type: Improvement > Components: shell >Reporter: Josh Elser >Assignee: Josh Elser > Fix For: 2.0.0 > > Attachments: HBASE-17973.001.patch, HBASE-17973.002.patch, > HBASE-17973.003.patch, HBASE-17973.branch-1.001.patch > > > The data locality of regions often plays a large role in the efficiency of > HBase. Compactions are also expensive to execute, especially on very large > tables. The balancer can do a good job trying to maintain locality (when > tuned properly), but it is not perfect. > This creates a less-than-desirable situation where it's a costly operation to > take a cluster with spotty poor locality (e.g. a small percentage of > regionservers with poor locality). > We already have this information available via the {{ClusterStatus}} proto. > We can easily write a shell command that can present regions which are > lacking a certain percentage of locality. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-14925) Develop HBase shell command/tool to list table's region info through command line
[ https://issues.apache.org/jira/browse/HBASE-14925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15999042#comment-15999042 ] Karan Mehta commented on HBASE-14925: - The addendum patch attached here will cleanly apply to {{branch-1}} after the {{branch-1}} patch for HBASE-17973 is committed. > Develop HBase shell command/tool to list table's region info through command > line > - > > Key: HBASE-14925 > URL: https://issues.apache.org/jira/browse/HBASE-14925 > Project: HBase > Issue Type: Improvement > Components: shell >Reporter: Romil Choksi >Assignee: Karan Mehta > Fix For: 2.0.0, 1.4.0 > > Attachments: HBASE-14925.002.patch, > HBASE-14925.003.addendum.001.patch, HBASE-14925.003.patch, HBASE-14925.patch > > > I am going through the hbase shell commands to see if there is anything I can > use to get all the regions info just for a particular table. I don’t see any > such command that provides me that information. > It would be better to have a command that provides region info, start key, > end key etc taking a table name as the input parameter. This is available > through HBase UI on clicking on a particular table's link > A tool/shell command to get a list of regions for a table or all tables in a > tabular structured output (that is machine readable) -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-17998) Improve HBase RPC write throttling size estimation
[ https://issues.apache.org/jira/browse/HBASE-17998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16001881#comment-16001881 ] Karan Mehta commented on HBASE-17998: - I did not find this really feasible, for the following reasons. 1. Calling {{getSerializedSize()}} on a protobuf to estimate its size requires traversing the complete protobuf data, which is what we wanted to avoid in the first place. 2. The size of Puts can be estimated from the number of bytes received for the RPC request. This information can be passed around with the {{HBaseRpcController}} class. However, it is difficult to estimate the size of only the {{Puts}} in a {{MultiRequest}}, since we just have a total buffer size which may include {{Gets}} and {{Scans}}. I am not sure we should really be making code changes just for this support. Rather than estimating the size with an arbitrary constant, a slightly better approach might be to use a moving average based on the requests seen in the past, as suggested by [~vincentpoon]. Let's discuss whether there are any other ways to accomplish this task. > Improve HBase RPC write throttling size estimation > -- > > Key: HBASE-17998 > URL: https://issues.apache.org/jira/browse/HBASE-17998 > Project: HBase > Issue Type: Improvement >Reporter: Karan Mehta >Assignee: Karan Mehta > > Currently when RPC throttling, the size of each put is estimated using a > hardcoded value 100 bytes. This can be improved by using the protobuf size as > an estimate, without having to deserialize or do a big refactoring. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
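The moving-average idea mentioned in the comment above could look roughly like the following sketch. It is only an illustration under assumptions: `PutSizeEstimator` is a hypothetical class, not an HBase one; it uses an exponential moving average (one of several possible moving averages); and it assumes the caller can attribute a byte count to a batch of Puts, which the comment notes is hard for a {{MultiRequest}}.

```java
/**
 * Hypothetical sketch: tracks the average size of a Put with an exponential
 * moving average instead of a fixed 100-byte guess. Not an HBase class.
 */
final class PutSizeEstimator {
    private final double alpha;   // weight given to the newest observation, in (0, 1]
    private double bytesPerPut;   // current estimate

    PutSizeEstimator(double initialBytesPerPut, double alpha) {
        if (alpha <= 0.0 || alpha > 1.0) {
            throw new IllegalArgumentException("alpha must be in (0, 1]");
        }
        this.bytesPerPut = initialBytesPerPut;
        this.alpha = alpha;
    }

    /** Folds one observed batch (total bytes for numPuts Puts) into the estimate. */
    synchronized void observe(long totalBytes, int numPuts) {
        if (numPuts <= 0) {
            return; // nothing to learn from an empty batch
        }
        double observed = (double) totalBytes / numPuts;
        bytesPerPut = alpha * observed + (1.0 - alpha) * bytesPerPut;
    }

    /** Estimated size in bytes of a batch of numPuts Puts. */
    synchronized long estimateFor(int numPuts) {
        return Math.round(bytesPerPut * numPuts);
    }

    public static void main(String[] args) {
        // Start from the hard-coded value named in the issue description.
        PutSizeEstimator estimator = new PutSizeEstimator(100.0, 0.5);
        estimator.observe(3000, 10); // a batch that averaged 300 bytes per Put
        System.out.println(estimator.estimateFor(1));
    }
}
```

A larger `alpha` reacts faster to changes in workload, while a smaller one smooths out occasional outliers; the right value would have to be chosen empirically.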
[jira] [Reopened] (HBASE-18000) Make sure we always return the scanner id with ScanResponse
[ https://issues.apache.org/jira/browse/HBASE-18000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karan Mehta reopened HBASE-18000: - > Make sure we always return the scanner id with ScanResponse > --- > > Key: HBASE-18000 > URL: https://issues.apache.org/jira/browse/HBASE-18000 > Project: HBase > Issue Type: Bug > Components: regionserver >Affects Versions: 2.0.0, 1.4.0, 1.3.1 >Reporter: Lars Hofhansl >Assignee: Duo Zhang > Fix For: 2.0.0, 1.4.0, 1.3.2 > > Attachments: HBASE-18000.patch > > > Some external tooling (like OpenTSDB) relies on the scanner id to tie > asynchronous responses back to their requests. > (see comments on HBASE-17489) -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-18000) Make sure we always return the scanner id with ScanResponse
[ https://issues.apache.org/jira/browse/HBASE-18000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16005665#comment-16005665 ] Karan Mehta commented on HBASE-18000: - [~Apache9] This is a corner case. I feel that the bug is not fully resolved. We set the {{scannerId}} on the {{ScanResponse}} builder here:
{code}
if (request.hasScannerId()) {
  rsh = getRegionScanner(request);
  isSmallScan = false;
  // The downstream projects such as AsyncHBase in OpenTSDB need this value. See HBASE-18000
  // for more details.
  builder.setScannerId(request.getScannerId());
} else {
{code}
and {{rsh = getRegionScanner(request);}} looks like this:
{code}
if (request.hasCloseScanner() && request.getCloseScanner()) {
  throw SCANNER_ALREADY_CLOSED;
} else {
{code}
which implies that for a {{CloseScannerRequest}} an exception is thrown by this line, so {{builder.setScannerId(request.getScannerId())}} is never executed. The exception is then handled by sending back an empty {{ScanResponse}}:
{code}
if (e == SCANNER_ALREADY_CLOSED) {
  // Now we will close scanner automatically if there are no more results for this region but
  // the old client will still send a close request to us. Just ignore it and return.
  return builder.build();
}
{code}
Thus no {{scannerId}} is added to the {{builder}}. A simple possible fix is to set the id before calling {{getRegionScanner}}. Please suggest.
{code}
 if (request.hasScannerId()) {
-  rsh = getRegionScanner(request);
-  isSmallScan = false;
   // The downstream projects such as AsyncHBase in OpenTSDB need this value. See HBASE-18000
   // for more details.
   builder.setScannerId(request.getScannerId());
+  rsh = getRegionScanner(request);
+  isSmallScan = false;
 } else {
{code}
> Make sure we always return the scanner id with ScanResponse > --- > > Key: HBASE-18000 > URL: https://issues.apache.org/jira/browse/HBASE-18000 > Project: HBase > Issue Type: Bug > Components: regionserver >Affects Versions: 2.0.0, 1.4.0, 1.3.1 >Reporter: Lars Hofhansl >Assignee: Duo Zhang > Fix For: 2.0.0, 1.4.0, 1.3.2 > > Attachments: HBASE-18000.patch > > > Some external tooling (like OpenTSDB) relies on the scanner id to tie > asynchronous responses back to their requests. > (see comments on HBASE-17489) -- This message was sent by Atlassian JIRA (v6.3.15#6346)
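The ordering problem described in the HBASE-18000 comment above can be reproduced in isolation. The sketch below uses hypothetical stand-ins (`ResponseBuilder`, `getRegionScanner`, `handle`) rather than the real {{RSRpcServices}} code; it only shows that a field set after a call that throws is lost, while a field set before the call survives into the response that the catch block returns.

```java
/** Hypothetical stand-ins for the builder and the scanner lookup; not HBase code. */
final class ScannerIdOrdering {
    static final RuntimeException SCANNER_ALREADY_CLOSED =
        new RuntimeException("scanner already closed");

    /** Minimal substitute for a response builder with an optional scanner id. */
    static final class ResponseBuilder {
        private Long scannerId; // null means the field was never set

        void setScannerId(long id) { scannerId = id; }
        boolean hasScannerId() { return scannerId != null; }
    }

    /** Throws for a close request on an already closed scanner, as in the comment. */
    static void getRegionScanner(boolean closeRequested) {
        if (closeRequested) {
            throw SCANNER_ALREADY_CLOSED;
        }
    }

    /** Runs the lookup and the setter in the chosen order and returns the builder. */
    static ResponseBuilder handle(long scannerId, boolean closeRequested, boolean setIdFirst) {
        ResponseBuilder builder = new ResponseBuilder();
        try {
            if (setIdFirst) {
                builder.setScannerId(scannerId);  // proposed order
                getRegionScanner(closeRequested);
            } else {
                getRegionScanner(closeRequested); // current order
                builder.setScannerId(scannerId);
            }
        } catch (RuntimeException e) {
            if (e != SCANNER_ALREADY_CLOSED) {
                throw e;
            }
            // Mirrors "just ignore it and return": fall through with whatever was set.
        }
        return builder;
    }

    public static void main(String[] args) {
        System.out.println(handle(42L, true, false).hasScannerId());
        System.out.println(handle(42L, true, true).hasScannerId());
    }
}
```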
[jira] [Created] (HBASE-18042) Client Compatibility breaks between versions 1.2 and 1.3
Karan Mehta created HBASE-18042: --- Summary: Client Compatibility breaks between versions 1.2 and 1.3 Key: HBASE-18042 URL: https://issues.apache.org/jira/browse/HBASE-18042 Project: HBase Issue Type: Bug Reporter: Karan Mehta OpenTSDB uses AsyncHBase as its client, rather than the traditional HBase client. From version 1.2 to 1.3, the {{ClientProtos}} changed: new fields were added to the {{ScanResponse}} proto. In 1.2, a typical Scan required the caller to make an OpenScanner request, GetNextRows requests and a CloseScanner request, driven by the {{more_results}} boolean field in the {{ScanResponse}} proto. However, from 1.3, a new parameter {{more_results_in_region}} was added, which limits the results per region. Therefore the client now has to manage sending all the requests for each region. Furthermore, if the results are exhausted for a particular region, the {{ScanResponse}} will set {{more_results_in_region}} to false, but {{more_results}} can still be true. Whenever the former is set to false, the {{RegionScanner}} will also be closed. OpenTSDB makes an OpenScanner request and receives all its results in the first {{ScanResponse}} itself, thus creating the condition described in the paragraph above. Since {{more_results}} is true, it proceeds to send the next request, at which point {{RSRpcServices}} throws {{UnknownScannerException}}. Protobuf client compatibility is maintained, but the expected behavior is modified. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
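The difference between the two client behaviours described in the HBASE-18042 report above can be made concrete with a small decision function. This is a simplified sketch, not AsyncHBase or HBase client code: `ScanFlow`, `NextStep` and both methods are hypothetical names, the two booleans stand for the scan-wide flag and the per-region flag on {{ScanResponse}}, and details such as detecting the last region are left out.

```java
/** Hypothetical sketch of how a client might react to one ScanResponse. */
final class ScanFlow {
    enum NextStep { NEXT_ON_SAME_SCANNER, OPEN_SCANNER_ON_NEXT_REGION, DONE }

    /** 1.2-style logic: only the scan-wide flag is consulted. */
    static NextStep legacyDecide(boolean moreResults) {
        return moreResults ? NextStep.NEXT_ON_SAME_SCANNER : NextStep.DONE;
    }

    /**
     * Region-aware logic: when the region is exhausted the server has already
     * closed the scanner, so the old scanner id must not be reused.
     */
    static NextStep regionAwareDecide(boolean moreResults, boolean moreResultsInRegion) {
        if (!moreResults) {
            return NextStep.DONE;
        }
        if (!moreResultsInRegion) {
            return NextStep.OPEN_SCANNER_ON_NEXT_REGION;
        }
        return NextStep.NEXT_ON_SAME_SCANNER;
    }

    public static void main(String[] args) {
        // The case from the report: the scan has more results, this region has none.
        System.out.println(legacyDecide(true));            // reuses a scanner the server closed
        System.out.println(regionAwareDecide(true, false)); // moves on to the next region
    }
}
```

In the reported scenario the legacy path sends a next request on the closed scanner, which is where the {{UnknownScannerException}} comes from.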
[jira] [Commented] (HBASE-18042) Client Compatibility breaks between versions 1.2 and 1.3
[ https://issues.apache.org/jira/browse/HBASE-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16008497#comment-16008497 ] Karan Mehta commented on HBASE-18042: - A simple solution is to not close the scanner even when there are no more results in the region. I am not sure about the other implications of this, though.

{{RSRpcServices.java}}
{code}
   addResults(builder, results, (PayloadCarryingRpcController) controller,
     RegionReplicaUtil.isDefaultReplica(region.getRegionInfo()));
-  if (!moreResults || !moreResultsInRegion || closeScanner) {
+  if (!moreResults || closeScanner) {
     scannerClosed = true;
     closeScanner(region, scanner, scannerName, context);
{code}
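For reference, the effect of the proposed one-line change to the close condition can be checked in isolation. The sketch below copies only the two boolean expressions from the diff into hypothetical helper methods (the class and method names are not from HBase) and lists the flag combinations on which they disagree.

```java
// Illustrative sketch only: the two predicates mirror the "-" and "+" lines
// of the diff; the class and method names are hypothetical.
final class CloseConditionSketch {

    /** Close condition on the removed ("-") line of the diff. */
    static boolean closesBefore(boolean moreResults, boolean moreResultsInRegion,
                                boolean closeScanner) {
        return !moreResults || !moreResultsInRegion || closeScanner;
    }

    /** Close condition on the added ("+") line of the diff. */
    static boolean closesAfter(boolean moreResults, boolean closeScanner) {
        return !moreResults || closeScanner;
    }

    public static void main(String[] args) {
        boolean[] values = {false, true};
        for (boolean moreResults : values) {
            for (boolean moreResultsInRegion : values) {
                for (boolean closeScanner : values) {
                    boolean before = closesBefore(moreResults, moreResultsInRegion, closeScanner);
                    boolean after = closesAfter(moreResults, closeScanner);
                    if (before != after) {
                        System.out.println("differs: moreResults=" + moreResults
                            + " moreResultsInRegion=" + moreResultsInRegion
                            + " closeScanner=" + closeScanner);
                    }
                }
            }
        }
    }
}
```

The two conditions differ only when the scan has more results overall, the current region is exhausted, and the client did not ask to close. That is the case in which the scanner would stay open under the proposed change.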
[jira] [Assigned] (HBASE-18042) Client Compatibility breaks between versions 1.2 and 1.3
[ https://issues.apache.org/jira/browse/HBASE-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karan Mehta reassigned HBASE-18042: --- Assignee: Karan Mehta
[jira] [Updated] (HBASE-18042) Client Compatibility breaks between versions 1.2 and 1.3
[ https://issues.apache.org/jira/browse/HBASE-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karan Mehta updated HBASE-18042: Affects Version/s: 1.3.1
[jira] [Commented] (HBASE-18042) Client Compatibility breaks between versions 1.2 and 1.3
[ https://issues.apache.org/jira/browse/HBASE-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16009182#comment-16009182 ] Karan Mehta commented on HBASE-18042: - HBASE-17489
[jira] [Commented] (HBASE-18042) Client Compatibility breaks between versions 1.2 and 1.3
[ https://issues.apache.org/jira/browse/HBASE-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16011044#comment-16011044 ] Karan Mehta commented on HBASE-18042: -

bq. We already have this field for branch-1.2.

I missed that, sorry about it. But the behavior of {{closeScanner}} changed between 1.2 and 1.3: in 1.2, although we set the {{more_results_in_region}} bit, we never close the scanner based on it. I am not sure what the golden behavior should be in such a case.

bq. IIRC, the problem is that the official hbase client implementation does not handle {{more_results_in_region}} correctly which leads to one more request to RS but get nothing. I think this is a bug?

Can you provide some more insight into this? I am not fully aware of it. From what I understand, in 1.3, if an OpenScanRequest for a region returns all the results, then setting the {{more_results_in_region}} bit should help the client save one RPC for CloseScanner, since the scanner is closed automatically on the server side. We do not get this advantage in 1.2. If an external client reads the {{more_results_in_region}} bit and doesn't send out the CloseScanRequest, then an open scanner will be left lying around on the server side, wasting resources.
[jira] [Commented] (HBASE-18042) Client Compatibility breaks between versions 1.2 and 1.3
[ https://issues.apache.org/jira/browse/HBASE-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16011147#comment-16011147 ] Karan Mehta commented on HBASE-18042: -

bq. So, more_rows was used to indicate more results in the region or more results overall in 1.2?

{{more_rows}} has always been used to indicate whether any results are pending overall.

bq. It may just be a bug in opentsdb itself to check more_results_in_region as well as more_rows.

I understand that OpenTSDB should handle {{more_results_in_region}}, but can we expect such a client-side behavior change between two minor releases? OpenTSDB works fine with 1.2 but not with 1.3. [~apurtell]
[jira] [Commented] (HBASE-18042) Client Compatibility breaks between versions 1.2 and 1.3
[ https://issues.apache.org/jira/browse/HBASE-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16012609#comment-16012609 ] Karan Mehta commented on HBASE-18042: -

bq. But it is ok to get any exception while closing a scanner because the ClientScanner will catch any exception from server in close().

A lot of unnecessary exceptions will be generated and routed to the client, which is not good. OpenTSDB makes a lot of RPC calls to fetch metrics data, and it sometimes ends up stuck in an endless loop of these exceptions. A single query might work fine. Sometimes it returns duplicate results, since it sends out multiple OpenScannerRequests after getting confused by the exception.

bq. I guess the problem for AsyncHBase is that it will also fetch data when opening a scanner, just like what we do in the new code in master and branch-1?

Yes, that is true. See the description for the exact behavior.
[jira] [Commented] (HBASE-11544) [Ergonomics] hbase.client.scanner.caching is dogged and will try to return batch even if it means OOME
[ https://issues.apache.org/jira/browse/HBASE-11544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16020554#comment-16020554 ] Karan Mehta commented on HBASE-11544: - [~jonathan.lawlor] Can the server return multiple partial rows? If yes, why does client side code assume that only the last result is partial in {{ClientScanner}} ? If not, why do we have a repeated boolean value for {{partial_flag_per_result}} in {{Client.protos}} ? > [Ergonomics] hbase.client.scanner.caching is dogged and will try to return > batch even if it means OOME > -- > > Key: HBASE-11544 > URL: https://issues.apache.org/jira/browse/HBASE-11544 > Project: HBase > Issue Type: Bug >Reporter: stack >Assignee: Jonathan Lawlor >Priority: Critical > Fix For: 2.0.0, 1.1.0 > > Attachments: Allocation_Hot_Spots.html, gc.j.png, > HBASE-11544-addendum-v1.patch, HBASE-11544-addendum-v2.patch, > HBASE-11544-branch_1_0-v1.patch, HBASE-11544-branch_1_0-v2.patch, > HBASE-11544-v1.patch, HBASE-11544-v2.patch, HBASE-11544-v3.patch, > HBASE-11544-v4.patch, HBASE-11544-v5.patch, HBASE-11544-v6.patch, > HBASE-11544-v6.patch, HBASE-11544-v6.patch, HBASE-11544-v7.patch, > HBASE-11544-v8-branch-1.patch, HBASE-11544-v8.patch, hits.j.png, h.png, > mean.png, m.png, net.j.png, q (2).png > > > Running some tests, I set hbase.client.scanner.caching=1000. Dataset has > large cells. I kept OOME'ing. > Serverside, we should measure how much we've accumulated and return to the > client whatever we've gathered once we pass out a certain size threshold > rather than keep accumulating till we OOME. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Comment Edited] (HBASE-11544) [Ergonomics] hbase.client.scanner.caching is dogged and will try to return batch even if it means OOME
[ https://issues.apache.org/jira/browse/HBASE-11544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16020554#comment-16020554 ] Karan Mehta edited comment on HBASE-11544 at 5/23/17 8:42 PM: -- [~jonathan.lawlor] [~stack] Can the server return multiple partial rows? If yes, why does client side code assume that only the last result is partial in {{ClientScanner}} ? If not, why do we have a repeated boolean value for {{partial_flag_per_result}} in {{Client.protos}} ? was (Author: karanmehta93): [~jonathan.lawlor] Can the server return multiple partial rows? If yes, why does client side code assume that only the last result is partial in {{ClientScanner}} ? If not, why do we have a repeated boolean value for {{partial_flag_per_result}} in {{Client.protos}} ?
[jira] [Created] (HBASE-18097) Client can save 1 RPC call for CloseScannerRequest
Karan Mehta created HBASE-18097: --- Summary: Client can save 1 RPC call for CloseScannerRequest Key: HBASE-18097 URL: https://issues.apache.org/jira/browse/HBASE-18097 Project: HBase Issue Type: Improvement Reporter: Karan Mehta

Starting with version 1.3, HBase automatically closes the scanner on the server side whenever the results are exhausted, and the corresponding bits are set in the {{ScanResponse}} proto returned to the client. We can use that information to eliminate the closeScanRequest RPC call, thereby saving 1 RPC per region per scan. This can be particularly useful for tables with many regions.

Also, the {{ScanResponse}} proto currently sends 1 bit per {{Result}} that it embeds inside the {{CellScanner}} to indicate whether that result is partial.
{code}
// In every RPC response there should be at most a single partial result. Furthermore, if
// there is a partial result, it is guaranteed to be in the last position of the array.
{code}
According to the client, only the last result can be partial, so this repeated bool can be converted to a single bool, reducing the overhead of serializing and deserializing the array. This would break wire compatibility, so it is something to consider for upcoming versions.
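The "repeated bool to single bool" idea rests on the invariant in the quoted client comment. As a minimal sketch, assuming the per-result flags arrive as a plain boolean array (the class and method names here are hypothetical and not taken from HBase), the invariant check and the collapse could look like this:

```java
// Illustrative sketch only: hypothetical helper, not HBase code. It models
// partial_flag_per_result as a boolean array, one entry per Result.
final class PartialFlagSketch {

    /** True when no result other than the last one is marked partial. */
    static boolean onlyLastMayBePartial(boolean... partialFlagPerResult) {
        for (int i = 0; i < partialFlagPerResult.length - 1; i++) {
            if (partialFlagPerResult[i]) {
                return false;
            }
        }
        return true;
    }

    /** Collapses the per-result flags into one flag describing the last result. */
    static boolean collapse(boolean... partialFlagPerResult) {
        if (!onlyLastMayBePartial(partialFlagPerResult)) {
            throw new IllegalArgumentException("a non-final result is marked partial");
        }
        int n = partialFlagPerResult.length;
        return n > 0 && partialFlagPerResult[n - 1];
    }

    public static void main(String[] args) {
        System.out.println(collapse(false, false, true));
        System.out.println(collapse(false, false));
    }
}
```

If the server can ever mark a non-final result as partial, the collapse loses information, which is the question raised on HBASE-11544.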
[jira] [Commented] (HBASE-18042) Client Compatibility breaks between versions 1.2 and 1.3
[ https://issues.apache.org/jira/browse/HBASE-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16023226#comment-16023226 ] Karan Mehta commented on HBASE-18042: - [~Apache9] Could you please clarify your explanation of the bug, along with the versions you are referring to as {{old client}}? Thanks
> Client Compatibility breaks between versions 1.2 and 1.3
>
> Key: HBASE-18042
> URL: https://issues.apache.org/jira/browse/HBASE-18042
> Project: HBase
> Issue Type: Bug
> Components: regionserver, scan
> Affects Versions: 2.0.0, 1.4.0, 1.3.1
> Reporter: Karan Mehta
> Assignee: Duo Zhang
> Priority: Critical
> Fix For: 2.0.0, 1.4.0, 1.3.2
> Attachments: HBASE-18042-branch-1.patch, HBASE-18042-branch-1.patch, HBASE-18042.patch, HBASE-18042-v1.patch
[jira] [Commented] (HBASE-18042) Client Compatibility breaks between versions 1.2 and 1.3
[ https://issues.apache.org/jira/browse/HBASE-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16024324#comment-16024324 ] Karan Mehta commented on HBASE-18042: - [~Apache9] Thanks for clarifying. Let me dig more into the code to understand it completely. I have also put up a patch for AsyncHBase after upgrading the protos from 0.98 to 1.3. Link: https://github.com/OpenTSDB/opentsdb/pull/990 With the patch it should work fine, but in general it is good to keep the logic consistent.
> Client Compatibility breaks between versions 1.2 and 1.3
>
> Key: HBASE-18042
> URL: https://issues.apache.org/jira/browse/HBASE-18042
> Project: HBase
> Issue Type: Bug
> Components: regionserver, scan
> Affects Versions: 2.0.0, 1.4.0, 1.3.1
> Reporter: Karan Mehta
> Assignee: Duo Zhang
> Priority: Critical
> Fix For: 2.0.0, 1.4.0, 1.3.2
> Attachments: HBASE-18042-branch-1.patch, HBASE-18042-branch-1.patch, HBASE-18042.patch, HBASE-18042-v1.patch, HBASE-18042-v2.patch
[jira] [Created] (HBASE-18117) Increase resiliency by allowing more parameters for online config change
Karan Mehta created HBASE-18117: --- Summary: Increase resiliency by allowing more parameters for online config change Key: HBASE-18117 URL: https://issues.apache.org/jira/browse/HBASE-18117 Project: HBase Issue Type: Improvement Reporter: Karan Mehta

HBASE-8544 added the ability to change configuration online without a server restart. This JIRA is to make more parameters use that feature. As [~apurtell] suggested, the following are useful and frequently changed parameters in production:
- RPC limits, timeouts, and other performance relevant settings
- Replication limits and batch sizes
- Region carrying limit
- WAL retention and cleaning parameters

I will try to make the RPC timeout parameter changeable online as part of this JIRA. If that works out, we can extend it to other parameters.
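As a rough illustration of what "making the RPC timeout parameter online" could involve, the sketch below uses a plain observer pattern. All types here are hypothetical stand-ins written for this example; they are not HBase's own {{ConfigurationObserver}} or {{ConfigurationManager}}, and a {{Map}} stands in for the Hadoop {{Configuration}} object. The key {{hbase.rpc.timeout}} is the usual RPC timeout setting.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative sketch only: hypothetical stand-ins, not the HBase classes.
interface ConfigObserverSketch {
    void onConfigurationChange(Map<String, String> newConf);
}

final class ConfigManagerSketch {
    private final Set<ConfigObserverSketch> observers = ConcurrentHashMap.newKeySet();

    void register(ConfigObserverSketch observer) {
        observers.add(observer);
    }

    /** Pushes a freshly loaded configuration to every registered observer. */
    void notifyAllObservers(Map<String, String> newConf) {
        for (ConfigObserverSketch observer : observers) {
            observer.onConfigurationChange(newConf);
        }
    }
}

/** Holds the RPC timeout and picks up a new value without a restart. */
final class RpcTimeoutHolderSketch implements ConfigObserverSketch {
    static final String KEY = "hbase.rpc.timeout";
    private volatile int timeoutMs;   // volatile: read by RPC threads, written on reload

    RpcTimeoutHolderSketch(int initialTimeoutMs) {
        this.timeoutMs = initialTimeoutMs;
    }

    int getTimeoutMs() {
        return timeoutMs;
    }

    @Override
    public void onConfigurationChange(Map<String, String> newConf) {
        String value = newConf.get(KEY);
        if (value != null) {
            timeoutMs = Integer.parseInt(value);
        }
    }

    public static void main(String[] args) {
        ConfigManagerSketch manager = new ConfigManagerSketch();
        RpcTimeoutHolderSketch holder = new RpcTimeoutHolderSketch(60000);
        manager.register(holder);
        manager.notifyAllObservers(Map.of(KEY, "5000"));
        System.out.println("timeout now " + holder.getTimeoutMs() + " ms");
    }
}
```

The same shape would apply to the other parameter groups in the list: each component that caches a setting registers once and re-reads its keys when notified.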
[jira] [Assigned] (HBASE-18117) Increase resiliency by allowing more parameters for online config change
[ https://issues.apache.org/jira/browse/HBASE-18117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karan Mehta reassigned HBASE-18117: --- Assignee: Karan Mehta
[jira] [Commented] (HBASE-18117) Increase resiliency by allowing more parameters for online config change
[ https://issues.apache.org/jira/browse/HBASE-18117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16025640#comment-16025640 ] Karan Mehta commented on HBASE-18117: - The current framework only allows changes to config parameters that are accessed solely on the server side. If {{ConfigurationObserver}} were implemented by any class in {{hbase-client}}, it would introduce a cyclic dependency between the hbase-client and hbase-server projects, and the build would fail.
[jira] [Commented] (HBASE-18117) Increase resiliency by allowing more parameters for online config change
[ https://issues.apache.org/jira/browse/HBASE-18117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16026518#comment-16026518 ] Karan Mehta commented on HBASE-18117: - {{ConfigurationManager}} manages all the observers and is meant to be a singleton, which is initialized inside {{RSRpcServices}}. However, it is declared package-private, which makes it difficult to use for parameters read by classes in other packages. A better approach is to move this framework from {{hbase-server}} to {{hbase-common}}. How does this approach seem? The framework can also follow the singleton design pattern if required.
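One possible shape for the "singleton in a common module" idea is sketched below. This is an assumption about how such a registry might look, not the actual {{ConfigurationManager}}; the class name is hypothetical, a {{Map}} stands in for the configuration object, and observers are modeled as plain {{Consumer}} callbacks.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

// Illustrative sketch only: a hypothetical process-wide registry. In a real
// module the class would be public in its own file so that other packages
// can reach it; it is left non-public here to keep the example single-file.
final class SharedConfigRegistrySketch {
    private static final SharedConfigRegistrySketch INSTANCE = new SharedConfigRegistrySketch();

    private final List<Consumer<Map<String, String>>> observers = new CopyOnWriteArrayList<>();

    private SharedConfigRegistrySketch() {
        // private: callers go through getInstance()
    }

    public static SharedConfigRegistrySketch getInstance() {
        return INSTANCE;
    }

    public void register(Consumer<Map<String, String>> observer) {
        observers.add(observer);
    }

    /** Notifies every observer and returns how many were notified. */
    public int notifyAllObservers(Map<String, String> newConf) {
        for (Consumer<Map<String, String>> observer : observers) {
            observer.accept(newConf);
        }
        return observers.size();
    }

    public static void main(String[] args) {
        getInstance().register(conf -> System.out.println("reloaded keys: " + conf.keySet()));
        getInstance().notifyAllObservers(Map.of("hbase.rpc.timeout", "5000"));
    }
}
```

Because the registry depends only on JDK types, both client-side and server-side classes could register with it without either module depending on the other.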
[jira] [Commented] (HBASE-18117) Increase resiliency by allowing more parameters for online config change
[ https://issues.apache.org/jira/browse/HBASE-18117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16026864#comment-16026864 ] Karan Mehta commented on HBASE-18117: - Another potential issue is ensuring that future code that uses an online parameter implements {{ConfigurationObserver}}. I couldn't find any such enforcement in the current framework. Could you please confirm? [~gaurav.menghani]
[jira] [Commented] (HBASE-18097) Client can save 1 RPC call for CloseScannerRequest
[ https://issues.apache.org/jira/browse/HBASE-18097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16028663#comment-16028663 ] Karan Mehta commented on HBASE-18097: - The first part for saving 1 RPC request is already implemented as a part of HBASE-17508, where the scannerId is set to -1 whenever results are not left in region. For the second part related to ScanResponse proto, we can save some data in RPC call. [~Apache9] Do you feel it is reasonable to do so from the next version? > Client can save 1 RPC call for CloseScannerRequest > -- > > Key: HBASE-18097 > URL: https://issues.apache.org/jira/browse/HBASE-18097 > Project: HBase > Issue Type: Improvement >Reporter: Karan Mehta > > Starting version 1.3, HBase automatically closes scanner on server side > whenever the results are exhausted and corresponding bits are set in the > {{ScanResponse}} proto returned to the client. We can use that info to > eliminate the closeScanRequest RPC call, thereby saving 1 RPC per region per > scan. This can be particularly useful for tables with more regions. > Also, currently the {{ScanResponse}} proto sends out 1 bit per {{Result}} > that it has embeds inside the {{CellScanner}} to indicate if it is partial or > not. > {code} > // In every RPC response there should be at most a single partial result. > Furthermore, if > // there is a partial result, it is guaranteed to be in the last position > of the array. > {code} > According to client, only the last result can be partial, thus this repeated > bool can be converted to a bool, thus reducing overhead of serialization and > deserialization of the array. This will break wire compatibility therefore > this is something to look for in upcoming versions. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-18097) Client can save 1 RPC call for CloseScannerRequest
[ https://issues.apache.org/jira/browse/HBASE-18097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karan Mehta updated HBASE-18097: Affects Version/s: 1.3.2 > Client can save 1 RPC call for CloseScannerRequest > -- > > Key: HBASE-18097 > URL: https://issues.apache.org/jira/browse/HBASE-18097 > Project: HBase > Issue Type: Improvement >Affects Versions: 1.3.2 >Reporter: Karan Mehta > > Starting version 1.3, HBase automatically closes scanner on server side > whenever the results are exhausted and corresponding bits are set in the > {{ScanResponse}} proto returned to the client. We can use that info to > eliminate the closeScanRequest RPC call, thereby saving 1 RPC per region per > scan. This can be particularly useful for tables with more regions. > Also, currently the {{ScanResponse}} proto sends out 1 bit per {{Result}} > that it has embeds inside the {{CellScanner}} to indicate if it is partial or > not. > {code} > // In every RPC response there should be at most a single partial result. > Furthermore, if > // there is a partial result, it is guaranteed to be in the last position > of the array. > {code} > According to client, only the last result can be partial, thus this repeated > bool can be converted to a bool, thus reducing overhead of serialization and > deserialization of the array. This will break wire compatibility therefore > this is something to look for in upcoming versions. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-18097) Save bandwidth on partial_flag_per_result in ScanResponse proto
[ https://issues.apache.org/jira/browse/HBASE-18097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karan Mehta updated HBASE-18097: Summary: Save bandwidth on partial_flag_per_result in ScanResponse proto (was: Client can save 1 RPC call for CloseScannerRequest) > Save bandwidth on partial_flag_per_result in ScanResponse proto > --- > > Key: HBASE-18097 > URL: https://issues.apache.org/jira/browse/HBASE-18097 > Project: HBase > Issue Type: Improvement >Affects Versions: 1.3.2 >Reporter: Karan Mehta > > Starting version 1.3, HBase automatically closes scanner on server side > whenever the results are exhausted and corresponding bits are set in the > {{ScanResponse}} proto returned to the client. We can use that info to > eliminate the closeScanRequest RPC call, thereby saving 1 RPC per region per > scan. This can be particularly useful for tables with more regions. > Also, currently the {{ScanResponse}} proto sends out 1 bit per {{Result}} > that it has embeds inside the {{CellScanner}} to indicate if it is partial or > not. > {code} > // In every RPC response there should be at most a single partial result. > Furthermore, if > // there is a partial result, it is guaranteed to be in the last position > of the array. > {code} > According to client, only the last result can be partial, thus this repeated > bool can be converted to a bool, thus reducing overhead of serialization and > deserialization of the array. This will break wire compatibility therefore > this is something to look for in upcoming versions. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-18097) Save bandwidth on partial_flag_per_result in ScanResponse proto
[ https://issues.apache.org/jira/browse/HBASE-18097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karan Mehta updated HBASE-18097: Description: Currently the {{ScanResponse}} proto sends out 1 bit per {{Result}} that it has embeds inside the {{CellScanner}} to indicate if it is partial or not. {code} // In every RPC response there should be at most a single partial result. Furthermore, if // there is a partial result, it is guaranteed to be in the last position of the array. {code} According to client, only the last result can be partial, thus this repeated bool can be converted to a bool, thus reducing overhead of serialization and deserialization of the array. This will break wire compatibility therefore this is something to look for in upcoming versions. was: Starting version 1.3, HBase automatically closes scanner on server side whenever the results are exhausted and corresponding bits are set in the {{ScanResponse}} proto returned to the client. We can use that info to eliminate the closeScanRequest RPC call, thereby saving 1 RPC per region per scan. This can be particularly useful for tables with more regions. Also, currently the {{ScanResponse}} proto sends out 1 bit per {{Result}} that it has embeds inside the {{CellScanner}} to indicate if it is partial or not. {code} // In every RPC response there should be at most a single partial result. Furthermore, if // there is a partial result, it is guaranteed to be in the last position of the array. {code} According to client, only the last result can be partial, thus this repeated bool can be converted to a bool, thus reducing overhead of serialization and deserialization of the array. This will break wire compatibility therefore this is something to look for in upcoming versions. 
> Save bandwidth on partial_flag_per_result in ScanResponse proto > --- > > Key: HBASE-18097 > URL: https://issues.apache.org/jira/browse/HBASE-18097 > Project: HBase > Issue Type: Improvement >Affects Versions: 1.3.2 >Reporter: Karan Mehta > > Currently the {{ScanResponse}} proto sends out 1 bit per {{Result}} that it > has embeds inside the {{CellScanner}} to indicate if it is partial or not. > {code} > // In every RPC response there should be at most a single partial result. > Furthermore, if > // there is a partial result, it is guaranteed to be in the last position > of the array. > {code} > According to client, only the last result can be partial, thus this repeated > bool can be converted to a bool, thus reducing overhead of serialization and > deserialization of the array. This will break wire compatibility therefore > this is something to look for in upcoming versions. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-18097) Save bandwidth on partial_flag_per_result in ScanResponse proto
[ https://issues.apache.org/jira/browse/HBASE-18097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karan Mehta updated HBASE-18097: Affects Version/s: (was: 1.3.2) 1.4.0 2.0.0 > Save bandwidth on partial_flag_per_result in ScanResponse proto > --- > > Key: HBASE-18097 > URL: https://issues.apache.org/jira/browse/HBASE-18097 > Project: HBase > Issue Type: Improvement >Affects Versions: 2.0.0, 1.4.0 >Reporter: Karan Mehta > > Currently the {{ScanResponse}} proto sends out 1 bit per {{Result}} that it > has embeds inside the {{CellScanner}} to indicate if it is partial or not. > {code} > // In every RPC response there should be at most a single partial result. > Furthermore, if > // there is a partial result, it is guaranteed to be in the last position > of the array. > {code} > According to client, only the last result can be partial, thus this repeated > bool can be converted to a bool, thus reducing overhead of serialization and > deserialization of the array. This will break wire compatibility therefore > this is something to look for in upcoming versions. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Assigned] (HBASE-18097) Save bandwidth on partial_flag_per_result in ScanResponse proto
[ https://issues.apache.org/jira/browse/HBASE-18097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karan Mehta reassigned HBASE-18097: --- Assignee: Karan Mehta > Save bandwidth on partial_flag_per_result in ScanResponse proto > --- > > Key: HBASE-18097 > URL: https://issues.apache.org/jira/browse/HBASE-18097 > Project: HBase > Issue Type: Improvement >Affects Versions: 2.0.0, 1.4.0 >Reporter: Karan Mehta >Assignee: Karan Mehta > > Currently the {{ScanResponse}} proto sends out 1 bit per {{Result}} that it > has embeds inside the {{CellScanner}} to indicate if it is partial or not. > {code} > // In every RPC response there should be at most a single partial result. > Furthermore, if > // there is a partial result, it is guaranteed to be in the last position > of the array. > {code} > According to client, only the last result can be partial, thus this repeated > bool can be converted to a bool, thus reducing overhead of serialization and > deserialization of the array. This will break wire compatibility therefore > this is something to look for in upcoming versions. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-18097) Save bandwidth on partial_flag_per_result in ScanResponse proto
[ https://issues.apache.org/jira/browse/HBASE-18097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16036448#comment-16036448 ] Karan Mehta commented on HBASE-18097: - The problem can occur if the client wants results in a specified batch size, in which case, the results can contain multiple partial results, which is then left to the user to handle appropriately, based on the {{partial}} flag inside the result. This is usually the case with AsyncHBaseClient. > Save bandwidth on partial_flag_per_result in ScanResponse proto > --- > > Key: HBASE-18097 > URL: https://issues.apache.org/jira/browse/HBASE-18097 > Project: HBase > Issue Type: Improvement >Affects Versions: 2.0.0, 1.4.0 >Reporter: Karan Mehta >Assignee: Karan Mehta > > Currently the {{ScanResponse}} proto sends out 1 bit per {{Result}} that it > has embeds inside the {{CellScanner}} to indicate if it is partial or not. > {code} > // In every RPC response there should be at most a single partial result. > Furthermore, if > // there is a partial result, it is guaranteed to be in the last position > of the array. > {code} > According to client, only the last result can be partial, thus this repeated > bool can be converted to a bool, thus reducing overhead of serialization and > deserialization of the array. This will break wire compatibility therefore > this is something to look for in upcoming versions. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
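The trade-off in the comment above can be made concrete with a minimal sketch. It is not the HBase or protobuf code: {{PartialFlags}} and its methods are hypothetical names, and a {{List<Boolean>}} stands in for the repeated bool in the {{ScanResponse}} proto. Collapsing the per-result flags into a single bool is lossless only while the quoted invariant holds (at most the last result is partial); a response with several partial results, as described for batched scans, cannot be represented that way.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of HBase.
final class PartialFlags {
  private PartialFlags() {}

  // True when no result other than the last one is marked partial.
  static boolean onlyLastMayBePartial(List<Boolean> perResult) {
    for (int i = 0; i < perResult.size() - 1; i++) {
      if (perResult.get(i)) {
        return false;
      }
    }
    return true;
  }

  // Collapses the per-result flags into one bool; refuses when that would lose information.
  static boolean collapse(List<Boolean> perResult) {
    if (!onlyLastMayBePartial(perResult)) {
      throw new IllegalArgumentException("more than the last result is partial");
    }
    return !perResult.isEmpty() && perResult.get(perResult.size() - 1);
  }

  // Rebuilds the per-result flags on the receiving side from the single bool.
  static List<Boolean> expand(int resultCount, boolean lastIsPartial) {
    List<Boolean> flags = new ArrayList<>(resultCount);
    for (int i = 0; i < resultCount; i++) {
      flags.add(i == resultCount - 1 && lastIsPartial);
    }
    return flags;
  }
}
```

In this sketch the batched case surfaces as an exception from {{collapse}}, which is one way to see why the single-bool form would need either a guarantee from the server or a fallback to the repeated field.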
[jira] [Comment Edited] (HBASE-18097) Save bandwidth on partial_flag_per_result in ScanResponse proto
[ https://issues.apache.org/jira/browse/HBASE-18097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16036448#comment-16036448 ] Karan Mehta edited comment on HBASE-18097 at 6/7/17 6:06 PM: - The problem can occur if the client wants results in a specified batch size, in which case, the results can contain multiple partial results, which is then left to the user to handle appropriately, based on the {{partial}} flag inside the result. This is usually the case with AsyncHBaseClient. Any suggestions, [~enis] or [~Apache9] ? was (Author: karanmehta93): The problem can occur if the client wants results in a specified batch size, in which case, the results can contain multiple partial results, which is then left to the user to handle appropriately, based on the {{partial}} flag inside the result. This is usually the case with AsyncHBaseClient. > Save bandwidth on partial_flag_per_result in ScanResponse proto > --- > > Key: HBASE-18097 > URL: https://issues.apache.org/jira/browse/HBASE-18097 > Project: HBase > Issue Type: Improvement >Affects Versions: 2.0.0, 1.4.0 >Reporter: Karan Mehta >Assignee: Karan Mehta > > Currently the {{ScanResponse}} proto sends out 1 bit per {{Result}} that it > has embeds inside the {{CellScanner}} to indicate if it is partial or not. > {code} > // In every RPC response there should be at most a single partial result. > Furthermore, if > // there is a partial result, it is guaranteed to be in the last position > of the array. > {code} > According to client, only the last result can be partial, thus this repeated > bool can be converted to a bool, thus reducing overhead of serialization and > deserialization of the array. This will break wire compatibility therefore > this is something to look for in upcoming versions. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-18228) HBCK improvements
[ https://issues.apache.org/jira/browse/HBASE-18228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16054461#comment-16054461 ] Karan Mehta commented on HBASE-18228: - [~lhofhansl] [~apurtell] I can take it up. > HBCK improvements > - > > Key: HBASE-18228 > URL: https://issues.apache.org/jira/browse/HBASE-18228 > Project: HBase > Issue Type: Improvement > Components: hbck >Reporter: Lars Hofhansl >Priority: Critical > Fix For: 1.4.0 > > > We just had a prod issue and running HBCK the way we did actually causes more > problems. > In part HBCK did stuff we did not expect, in part we had little visibility > into what HBCK was doing, and in part the logging was confusing. > I'm proposing 2 improvements: > 1. A dry-run mode. Run, and just list what would have been done. > 2. An interactive mode. Run, and for each action request Y/N user input. So > that a user can opt-out of stuff. > [~jmhsieh], FYI -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (HBASE-18228) HBCK improvements
[ https://issues.apache.org/jira/browse/HBASE-18228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karan Mehta reassigned HBASE-18228: --- Assignee: Karan Mehta > HBCK improvements > - > > Key: HBASE-18228 > URL: https://issues.apache.org/jira/browse/HBASE-18228 > Project: HBase > Issue Type: Improvement > Components: hbck >Reporter: Lars Hofhansl >Assignee: Karan Mehta >Priority: Critical > Fix For: 1.4.0 > > > We just had a prod issue and running HBCK the way we did actually causes more > problems. > In part HBCK did stuff we did not expect, in part we had little visibility > into what HBCK was doing, and in part the logging was confusing. > I'm proposing 2 improvements: > 1. A dry-run mode. Run, and just list what would have been done. > 2. An interactive mode. Run, and for each action request Y/N user input. So > that a user can opt-out of stuff. > [~jmhsieh], FYI -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-18228) HBCK improvements
[ https://issues.apache.org/jira/browse/HBASE-18228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16063613#comment-16063613 ] Karan Mehta commented on HBASE-18228: - What should be the granularity of the operation? For example, running -fixAssignments or -fixHoles on a table would run certain steps for all the regions. Do we need to ask the user for individual step confirmation or for the command as a whole? The pros are - More granularity, more power / flexibility to the user The cons are - Lots of questions / decisions for the user if the table has a large number of regions - Hbck will run in parallel for every regionserver. The messages will be intermingled. - User might accidentally leave the cluster in an unhealthy state. For example, the user might fix certain holes in meta but not others. The alternate option is to get user confirmation before every major step, which would help if switches like -repair are used, which internally perform a bunch of other steps. [~jmhsieh] [~lhofhansl] [~apurtell] [~churromorales] Please suggest. > HBCK improvements > - > > Key: HBASE-18228 > URL: https://issues.apache.org/jira/browse/HBASE-18228 > Project: HBase > Issue Type: Improvement > Components: hbck >Reporter: Lars Hofhansl >Assignee: Karan Mehta >Priority: Critical > Fix For: 1.4.0 > > > We just had a prod issue and running HBCK the way we did actually causes more > problems. > In part HBCK did stuff we did not expect, in part we had little visibility > into what HBCK was doing, and in part the logging was confusing. > I'm proposing 2 improvements: > 1. A dry-run mode. Run, and just list what would have been done. > 2. An interactive mode. Run, and for each action request Y/N user input. So > that a user can opt-out of stuff. > [~jmhsieh], FYI -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Comment Edited] (HBASE-18228) HBCK improvements
[ https://issues.apache.org/jira/browse/HBASE-18228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16063613#comment-16063613 ] Karan Mehta edited comment on HBASE-18228 at 6/26/17 7:21 PM: -- What should be the granularity of the operation? For example, running -fixAssignments or -fixHoles on a table would run certain steps for all the regions. Do we need to ask the user for individual step confirmation or for the command as a whole? The pros are * More granularity, more power / flexibility to the user The cons are * Lots of questions / decisions for the user if the table has a large number of regions * Hbck will run in parallel for every regionserver. The messages will be intermingled. * User might accidentally leave the cluster in an unhealthy state. For example, the user might fix certain holes in meta but not others. * The alternate option is to get user confirmation before every major step, which would help if switches like -repair are used, which internally perform a bunch of other steps. [~jmhsieh] [~lhofhansl] [~apurtell] [~churromorales] Please suggest. was (Author: karanmehta93): What should be the granularity of the operation? For example, running -fixAssignments or -fixHoles on a table would run certain steps for all the regions. Do we need to ask the user for individual step confirmation or for the command as a whole? The pros are - More granularity, more power / flexibility to the user The cons are - Lots of questions / decisions for the user if the table has a large number of regions - Hbck will run in parallel for every regionserver. The messages will be intermingled. - User might accidentally leave the cluster in an unhealthy state. For example, the user might fix certain holes in meta but not others. The alternate option is to get user confirmation before every major step, which would help if switches like -repair are used, which internally perform a bunch of other steps.
[~jmhsieh] [~lhofhansl] [~apurtell] [~churromorales] Please suggest. > HBCK improvements > - > > Key: HBASE-18228 > URL: https://issues.apache.org/jira/browse/HBASE-18228 > Project: HBase > Issue Type: Improvement > Components: hbck >Reporter: Lars Hofhansl >Assignee: Karan Mehta >Priority: Critical > Fix For: 1.4.0 > > > We just had a prod issue and running HBCK the way we did actually causes more > problems. > In part HBCK did stuff we did not expect, in part we had little visibility > into what HBCK was doing, and in part the logging was confusing. > I'm proposing 2 improvements: > 1. A dry-run mode. Run, and just list what would have been done. > 2. An interactive mode. Run, and for each action request Y/N user input. So > that a user can opt-out of stuff. > [~jmhsieh], FYI -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Comment Edited] (HBASE-18228) HBCK improvements
[ https://issues.apache.org/jira/browse/HBASE-18228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16063613#comment-16063613 ] Karan Mehta edited comment on HBASE-18228 at 6/26/17 7:22 PM: -- What should be the granularity of the operation? For example, running -fixAssignments or -fixHoles on a table would run certain steps for all the regions. Do we need to ask the user for individual step confirmation or for the command as a whole? The pros are * More granularity, more power / flexibility to the user The cons are * Lots of questions / decisions for the user if the table has a large number of regions * Hbck will run in parallel for every regionserver. The messages will be intermingled. * User might accidentally leave the cluster in an unhealthy state. For example, the user might fix certain holes in meta but not others. The alternate option is to get user confirmation before every major step, which would help if switches like -repair are used, which internally perform a bunch of other steps. [~jmhsieh] [~lhofhansl] [~apurtell] [~churromorales] Please suggest. was (Author: karanmehta93): What should be the granularity of the operation? For example, running -fixAssignments or -fixHoles on a table would run certain steps for all the regions. Do we need to ask the user for individual step confirmation or for the command as a whole? The pros are * More granularity, more power / flexibility to the user The cons are * Lots of questions / decisions for the user if the table has a large number of regions * Hbck will run in parallel for every regionserver. The messages will be intermingled. * User might accidentally leave the cluster in an unhealthy state. For example, the user might fix certain holes in meta but not others. * The alternate option is to get user confirmation before every major step, which would help if switches like -repair are used, which internally perform a bunch of other steps.
[~jmhsieh] [~lhofhansl] [~apurtell] [~churromorales] Please suggest. > HBCK improvements > - > > Key: HBASE-18228 > URL: https://issues.apache.org/jira/browse/HBASE-18228 > Project: HBase > Issue Type: Improvement > Components: hbck >Reporter: Lars Hofhansl >Assignee: Karan Mehta >Priority: Critical > Fix For: 1.4.0 > > > We just had a prod issue and running HBCK the way we did actually causes more > problems. > In part HBCK did stuff we did not expect, in part we had little visibility > into what HBCK was doing, and in part the logging was confusing. > I'm proposing 2 improvements: > 1. A dry-run mode. Run, and just list what would have been done. > 2. An interactive mode. Run, and for each action request Y/N user input. So > that a user can opt-out of stuff. > [~jmhsieh], FYI -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HBASE-18228) HBCK improvements
[ https://issues.apache.org/jira/browse/HBASE-18228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karan Mehta updated HBASE-18228: Status: Patch Available (was: Open) > HBCK improvements > - > > Key: HBASE-18228 > URL: https://issues.apache.org/jira/browse/HBASE-18228 > Project: HBase > Issue Type: Improvement > Components: hbck >Reporter: Lars Hofhansl >Assignee: Karan Mehta >Priority: Critical > Fix For: 1.4.0 > > Attachments: HBASE-18228.branch-1.3.patch > > > We just had a prod issue and running HBCK the way we did actually causes more > problems. > In part HBCK did stuff we did not expect, in part we had little visibility > into what HBCK was doing, and in part the logging was confusing. > I'm proposing 2 improvements: > 1. A dry-run mode. Run, and just list what would have been done. > 2. An interactive mode. Run, and for each action request Y/N user input. So > that a user can opt-out of stuff. > [~jmhsieh], FYI -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HBASE-18228) HBCK improvements
[ https://issues.apache.org/jira/browse/HBASE-18228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karan Mehta updated HBASE-18228: Attachment: HBASE-18228.branch-1.3.patch The patch includes the following changes: New switches for hbck * {{-dryRun}} --> Runs HBCK without modifying anything and prints the potential changes this particular run would make. It cannot output full detail, since some operations can only be performed after an earlier one is done. * {{-i}} --> Interactive HBCK. Asks for user input before every potential modification, for example before fixing a particular hole in META or creating a new .regionInfo file. Also asks for user confirmation for options such as -repair and -repairHoles, which internally run several other switches. [~apurtell] [~jmhsieh] [~mdrob] Please review. > HBCK improvements > - > > Key: HBASE-18228 > URL: https://issues.apache.org/jira/browse/HBASE-18228 > Project: HBase > Issue Type: Improvement > Components: hbck >Reporter: Lars Hofhansl >Assignee: Karan Mehta >Priority: Critical > Fix For: 1.4.0 > > Attachments: HBASE-18228.branch-1.3.patch > > > We just had a prod issue and running HBCK the way we did actually causes more > problems. > In part HBCK did stuff we did not expect, in part we had little visibility > into what HBCK was doing, and in part the logging was confusing. > I'm proposing 2 improvements: > 1. A dry-run mode. Run, and just list what would have been done. > 2. An interactive mode. Run, and for each action request Y/N user input. So > that a user can opt-out of stuff. > [~jmhsieh], FYI -- This message was sent by Atlassian JIRA (v6.4.14#64029)
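The behaviour of the two switches described above can be sketched as a single gate that every modification passes through. This is an illustration under assumptions, not the attached patch: {{RepairGate}}, its {{Mode}} enum and the log prefixes are hypothetical names, and the confirmation callback stands in for reading Y/N from the console.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical gate for illustration; not the code in HBASE-18228.branch-1.3.patch.
final class RepairGate {
  enum Mode { APPLY, DRY_RUN, INTERACTIVE }

  private final Mode mode;
  private final Predicate<String> confirm; // consulted only in INTERACTIVE mode
  private final List<String> log = new ArrayList<>();

  RepairGate(Mode mode, Predicate<String> confirm) {
    this.mode = mode;
    this.confirm = confirm;
  }

  // Runs the action if the mode (and the user, when asked) allows it; returns whether it ran.
  boolean run(String description, Runnable action) {
    if (mode == Mode.DRY_RUN) {
      log.add("WOULD: " + description); // report only, change nothing
      return false;
    }
    if (mode == Mode.INTERACTIVE && !confirm.test(description)) {
      log.add("SKIPPED: " + description); // user opted out of this step
      return false;
    }
    action.run();
    log.add("DID: " + description);
    return true;
  }

  List<String> log() {
    return log;
  }
}
```

A dry run in this sketch never executes an action, so any step that depends on the outcome of an earlier one can only be reported approximately, which matches the limitation noted for {{-dryRun}}. Routing every prompt through one gate also gives a single place to serialize messages if checks run in parallel.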
[jira] [Commented] (HBASE-18228) HBCK improvements
[ https://issues.apache.org/jira/browse/HBASE-18228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16070747#comment-16070747 ] Karan Mehta commented on HBASE-18228: - bq. Also, curious, why branch 1.3 specifically? I will submit a patch for branch-1 as well. It's just that I started working with this branch and the scope is limited to 1.4 anyways for this patch. [~mdrob] Added the review board link. > HBCK improvements > - > > Key: HBASE-18228 > URL: https://issues.apache.org/jira/browse/HBASE-18228 > Project: HBase > Issue Type: Improvement > Components: hbck >Reporter: Lars Hofhansl >Assignee: Karan Mehta >Priority: Critical > Fix For: 1.4.0 > > Attachments: HBASE-18228.branch-1.3.patch > > > We just had a prod issue and running HBCK the way we did actually causes more > problems. > In part HBCK did stuff we did not expect, in part we had little visibility > into what HBCK was doing, and in part the logging was confusing. > I'm proposing 2 improvements: > 1. A dry-run mode. Run, and just list what would have been done. > 2. An interactive mode. Run, and for each action request Y/N user input. So > that a user can opt-out of stuff. > [~jmhsieh], FYI -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Comment Edited] (HBASE-18228) HBCK improvements
[ https://issues.apache.org/jira/browse/HBASE-18228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16076809#comment-16076809 ] Karan Mehta edited comment on HBASE-18228 at 7/6/17 4:08 PM: - [~mdrob] [~te...@apache.org] Can you provide some feedback on my reply on review board? was (Author: karanmehta93): [~mdrob] Can you provide some feedback on my reply on review board? > HBCK improvements > - > > Key: HBASE-18228 > URL: https://issues.apache.org/jira/browse/HBASE-18228 > Project: HBase > Issue Type: Improvement > Components: hbck >Reporter: Lars Hofhansl >Assignee: Karan Mehta >Priority: Critical > Fix For: 1.4.0 > > Attachments: HBASE-18228.branch-1.3.patch > > > We just had a prod issue and running HBCK the way we did actually causes more > problems. > In part HBCK did stuff we did not expect, in part we had little visibility > into what HBCK was doing, and in part the logging was confusing. > I'm proposing 2 improvements: > 1. A dry-run mode. Run, and just list what would have been done. > 2. An interactive mode. Run, and for each action request Y/N user input. So > that a user can opt-out of stuff. > [~jmhsieh], FYI -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-18228) HBCK improvements
[ https://issues.apache.org/jira/browse/HBASE-18228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16076809#comment-16076809 ] Karan Mehta commented on HBASE-18228: - [~mdrob] Can you provide some feedback on my reply on review board? > HBCK improvements > - > > Key: HBASE-18228 > URL: https://issues.apache.org/jira/browse/HBASE-18228 > Project: HBase > Issue Type: Improvement > Components: hbck >Reporter: Lars Hofhansl >Assignee: Karan Mehta >Priority: Critical > Fix For: 1.4.0 > > Attachments: HBASE-18228.branch-1.3.patch > > > We just had a prod issue and running HBCK the way we did actually causes more > problems. > In part HBCK did stuff we did not expect, in part we had little visibility > into what HBCK was doing, and in part the logging was confusing. > I'm proposing 2 improvements: > 1. A dry-run mode. Run, and just list what would have been done. > 2. An interactive mode. Run, and for each action request Y/N user input. So > that a user can opt-out of stuff. > [~jmhsieh], FYI -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HBASE-18396) Encode ZNode names to reduce ZooKeeper jute buffer length requirements and thus reduce memory usage
Karan Mehta created HBASE-18396: --- Summary: Encode ZNode names to reduce ZooKeeper jute buffer length requirements and thus reduce memory usage Key: HBASE-18396 URL: https://issues.apache.org/jira/browse/HBASE-18396 Project: HBase Issue Type: Improvement Affects Versions: 3.0.0 Reporter: Karan Mehta In our production environment, we hit the error {{ZooKeeper connectionLoss due to jute.maxbuffer len of 1M getting exceeded}}. Usually 1 MB is a lot, but in case of multi requests, it can exceed the maximum buffer length that is allocated. This JIRA is a discussion for encoding various znode names. IMO, this will reduce the path lengths, thus reducing the size of buffer required as well as network packet size and also pack more requests in a single multi. As with encoding, this will introduce overhead, but we need to determine how feasible this idea is. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
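The size argument in the description above can be checked with rough arithmetic. The sketch below is only an estimate under stated assumptions: {{MultiSizeEstimate}} is a hypothetical helper, the per-operation overhead is a made-up parameter rather than ZooKeeper's real wire overhead, and the buffer size is passed in by the caller (for example the 1 MB figure mentioned in the description).

```java
import java.nio.charset.StandardCharsets;

// Hypothetical estimator for illustration; it does not model the real jute serialization.
final class MultiSizeEstimate {
  private MultiSizeEstimate() {}

  // Approximate bytes for one operation: UTF-8 path bytes plus an assumed fixed overhead.
  static int opBytes(String znodePath, int assumedOverheadBytes) {
    return znodePath.getBytes(StandardCharsets.UTF_8).length + assumedOverheadBytes;
  }

  // How many operations of the given size fit into one buffer.
  static int opsThatFit(int bufferBytes, int bytesPerOp) {
    return bufferBytes / bytesPerOp;
  }
}
```

Under these assumptions, halving the bytes per operation doubles the number of operations that fit in one multi, so the benefit of encoding depends on how much of each operation is path and how much is fixed overhead or data.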
[jira] [Commented] (HBASE-18396) Encode ZNode names to reduce ZooKeeper jute buffer length requirements and thus reduce memory usage
[ https://issues.apache.org/jira/browse/HBASE-18396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16118956#comment-16118956 ] Karan Mehta commented on HBASE-18396: - [~mdrob] Could you please elaborate? > Encode ZNode names to reduce ZooKeeper jute buffer length requirements and > thus reduce memory usage > --- > > Key: HBASE-18396 > URL: https://issues.apache.org/jira/browse/HBASE-18396 > Project: HBase > Issue Type: Improvement >Affects Versions: 3.0.0 >Reporter: Karan Mehta > > In our production environment, we hit the error {{ZooKeeper connectionLoss > due to jute.maxbuffer len of 1M getting exceeded}}. Usually 1 MB is a lot, > but in case of multi requests, it can exceed the maximum buffer length that > is allocated. > This JIRA is a discussion for encoding various znode names. IMO, this will > reduce the path lengths, thus reducing the size of buffer required as well as > network packet size and also pack more requests in a single multi. As with > encoding, this will introduce overhead, but we need to determine how feasible > this idea is. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-18228) HBCK improvements
[ https://issues.apache.org/jira/browse/HBASE-18228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16389950#comment-16389950 ] Karan Mehta commented on HBASE-18228: - [~lhofhansl] A patch was attempted for this JIRA. However, it doesn't seem as useful as expected. Would you mind discussing other potential improvements here? FYI, this issue specifically addresses branch-1.3 (possibly 1.4). > HBCK improvements > - > > Key: HBASE-18228 > URL: https://issues.apache.org/jira/browse/HBASE-18228 > Project: HBase > Issue Type: Improvement > Components: hbck >Reporter: Lars Hofhansl >Assignee: Karan Mehta >Priority: Critical > Fix For: 1.5.0, 1.4.3 > > Attachments: HBASE-18228.branch-1.3.patch > > > We just had a prod issue and running HBCK the way we did actually causes more > problems. > In part HBCK did stuff we did not expect, in part we had little visibility > into what HBCK was doing, and in part the logging was confusing. > I'm proposing 2 improvements: > 1. A dry-run mode. Run, and just list what would have been done. > 2. An interactive mode. Run, and for each action request Y/N user input. So > that a user can opt-out of stuff. > [~jmhsieh], FYI -- This message was sent by Atlassian JIRA (v7.6.3#76005)
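The two proposed modes could look roughly like the sketch below. This is not HBCK code and not the attached patch: `RepairRunner`, `Mode`, and the report strings are hypothetical names chosen for illustration, and the repair actions are modelled as plain `Runnable`s keyed by a display name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical sketch of dry-run and interactive execution; not HBCK's API.
final class RepairRunner {

    enum Mode { APPLY, DRY_RUN, INTERACTIVE }

    // Walks the actions in iteration order and returns a report of what
    // happened. In DRY_RUN nothing is executed; in INTERACTIVE each action
    // runs only if confirm.test(name) returns true (the Y/N prompt).
    static List<String> run(Map<String, Runnable> actions, Mode mode,
                            Predicate<String> confirm) {
        List<String> report = new ArrayList<>();
        for (Map.Entry<String, Runnable> entry : actions.entrySet()) {
            String name = entry.getKey();
            if (mode == Mode.DRY_RUN) {
                report.add("WOULD RUN " + name);
                continue;
            }
            if (mode == Mode.INTERACTIVE && !confirm.test(name)) {
                report.add("SKIPPED " + name);
                continue;
            }
            entry.getValue().run();
            report.add("RAN " + name);
        }
        return report;
    }
}
```

Passing the confirmation step as a `Predicate<String>` keeps the prompt out of the runner, so the same loop can be driven by console input in production and by a fixed answer in tests. A `LinkedHashMap` would keep the actions in the order they were registered.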
[jira] [Commented] (HBASE-18097) Save bandwidth on partial_flag_per_result in ScanResponse proto
[ https://issues.apache.org/jira/browse/HBASE-18097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242832#comment-16242832 ] Karan Mehta commented on HBASE-18097: - Ping [~enis] [~Apache9] Any thoughts or suggestions? Is the improvement worth it? > Save bandwidth on partial_flag_per_result in ScanResponse proto > --- > > Key: HBASE-18097 > URL: https://issues.apache.org/jira/browse/HBASE-18097 > Project: HBase > Issue Type: Improvement >Affects Versions: 2.0.0, 3.0.0, 1.5.0 >Reporter: Karan Mehta >Assignee: Karan Mehta > > Currently the {{ScanResponse}} proto sends one boolean flag per {{Result}} that it > embeds inside the {{CellScanner}} to indicate whether it is partial. > {code} > // In every RPC response there should be at most a single partial result. > Furthermore, if > // there is a partial result, it is guaranteed to be in the last position > of the array. > {code} > According to the client, only the last result can be partial, so this repeated > bool can be converted to a single bool, reducing the overhead of serializing and > deserializing the array. This will break wire compatibility, so > this is something to consider for upcoming versions. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
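The equivalence this proposal relies on can be sketched as follows: if only the last result may be partial, the per-result flag list carries no more information than one boolean plus the result count. The class and method names below are hypothetical, and this is not the HBase protobuf code; it only shows that the collapse is lossless under the quoted invariant.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration; not the actual ScanResponse handling code.
final class PartialFlagCodec {

    // Collapses per-result flags into a single flag. Rejects input that
    // violates the invariant that only the last result may be partial.
    static boolean collapse(List<Boolean> perResult) {
        for (int i = 0; i < perResult.size() - 1; i++) {
            if (perResult.get(i)) {
                throw new IllegalArgumentException(
                    "only the last result may be partial, but index " + i + " is");
            }
        }
        return !perResult.isEmpty() && perResult.get(perResult.size() - 1);
    }

    // Rebuilds the per-result flags from the single flag and the result count.
    static List<Boolean> expand(boolean lastIsPartial, int resultCount) {
        List<Boolean> flags =
            new ArrayList<>(Collections.nCopies(resultCount, Boolean.FALSE));
        if (lastIsPartial && resultCount > 0) {
            flags.set(resultCount - 1, Boolean.TRUE);
        }
        return flags;
    }
}
```

For any list that satisfies the invariant, `expand(collapse(flags), flags.size())` returns the original list, which is why the repeated field could be replaced without losing information. The wire-compatibility concern raised in the issue is separate and is not addressed by this sketch.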
[jira] [Assigned] (HBASE-21553) schedLock not released ni MasterProcedureScheduler
[ https://issues.apache.org/jira/browse/HBASE-21553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karan Mehta reassigned HBASE-21553: --- Assignee: Karan Mehta > schedLock not released ni MasterProcedureScheduler > -- > > Key: HBASE-21553 > URL: https://issues.apache.org/jira/browse/HBASE-21553 > Project: HBase > Issue Type: Improvement >Reporter: Xu Cang >Assignee: Karan Mehta >Priority: Major > > https://github.com/apache/hbase/blob/branch-1/hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/MasterProcedureScheduler.java#L749 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-21553) schedLock not released in MasterProcedureScheduler
[ https://issues.apache.org/jira/browse/HBASE-21553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karan Mehta updated HBASE-21553: Summary: schedLock not released in MasterProcedureScheduler (was: schedLock not released ni MasterProcedureScheduler) > schedLock not released in MasterProcedureScheduler > -- > > Key: HBASE-21553 > URL: https://issues.apache.org/jira/browse/HBASE-21553 > Project: HBase > Issue Type: Improvement >Reporter: Xu Cang >Priority: Major > > https://github.com/apache/hbase/blob/branch-1/hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/MasterProcedureScheduler.java#L749 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (HBASE-21553) schedLock not released ni MasterProcedureScheduler
[ https://issues.apache.org/jira/browse/HBASE-21553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karan Mehta reassigned HBASE-21553: --- Assignee: (was: Karan Mehta) > schedLock not released ni MasterProcedureScheduler > -- > > Key: HBASE-21553 > URL: https://issues.apache.org/jira/browse/HBASE-21553 > Project: HBase > Issue Type: Improvement >Reporter: Xu Cang >Priority: Major > > https://github.com/apache/hbase/blob/branch-1/hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/MasterProcedureScheduler.java#L749 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21553) schedLock not released in MasterProcedureScheduler
[ https://issues.apache.org/jira/browse/HBASE-21553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16710977#comment-16710977 ] Karan Mehta commented on HBASE-21553: - Good finding, [~xucang]! FYI [~sukumaddineni] [~swaroopa] This is probably the root cause of the stuck procedures in the cluster. > schedLock not released in MasterProcedureScheduler > -- > > Key: HBASE-21553 > URL: https://issues.apache.org/jira/browse/HBASE-21553 > Project: HBase > Issue Type: Improvement >Reporter: Xu Cang >Priority: Major > > https://github.com/apache/hbase/blob/branch-1/hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/MasterProcedureScheduler.java#L749 > As shown above, we didn't unlock schedLock, which can cause a deadlock. > Besides this, other places in this class handle schedLock.unlock > in a risky manner. I'd like to move them into finally blocks to improve the > robustness of lock handling. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-21553) schedLock not released in MasterProcedureScheduler
[ https://issues.apache.org/jira/browse/HBASE-21553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16715995#comment-16715995 ] Karan Mehta commented on HBASE-21553: - Is this not going into branch-1.3 or branch-1.2? > schedLock not released in MasterProcedureScheduler > -- > > Key: HBASE-21553 > URL: https://issues.apache.org/jira/browse/HBASE-21553 > Project: HBase > Issue Type: Improvement >Reporter: Xu Cang >Assignee: Xu Cang >Priority: Major > Fix For: 1.5.0, 1.4.10 > > Attachments: HBASE-21553-branch-1.001.patch, > HBASE-21553-branch-1.002.patch > > > https://github.com/apache/hbase/blob/branch-1/hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/MasterProcedureScheduler.java#L749 > As shown above, we didn't unlock schedLock, which can cause a deadlock. > Besides this, other places in this class handle schedLock.unlock > in a risky manner. I'd like to move them into finally blocks to improve the > robustness of lock handling. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
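The "unlock in a finally block" pattern described in this issue can be sketched as below. This is a generic illustration with `java.util.concurrent.locks.ReentrantLock`, not the MasterProcedureScheduler code or the attached patches: `GuardedQueue` and its methods are hypothetical, and only the field name `schedLock` is borrowed from the issue.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical class for illustration; not MasterProcedureScheduler.
final class GuardedQueue {
    private final ReentrantLock schedLock = new ReentrantLock();
    private final Queue<String> queue = new ArrayDeque<>();

    // Every exit path, including the exception below, releases the lock
    // because unlock() sits in the finally block.
    void add(String item) {
        schedLock.lock();
        try {
            if (item == null) {
                throw new IllegalArgumentException("item must not be null");
            }
            queue.add(item);
        } finally {
            schedLock.unlock();
        }
    }

    // Returns the next item, or null if the queue is empty.
    String poll() {
        schedLock.lock();
        try {
            return queue.poll();
        } finally {
            schedLock.unlock();
        }
    }

    // Exposed only so a caller can check that the lock was released.
    boolean isLockHeld() {
        return schedLock.isHeldByCurrentThread();
    }
}
```

If `unlock()` were instead written after the body without `try`/`finally`, an early return or an exception would leave the lock held and every later caller would block, which matches the stuck-procedure symptom mentioned in the comments above.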
[jira] [Assigned] (HBASE-18228) HBCK improvements
[ https://issues.apache.org/jira/browse/HBASE-18228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karan Mehta reassigned HBASE-18228: --- Assignee: (was: Karan Mehta) > HBCK improvements > - > > Key: HBASE-18228 > URL: https://issues.apache.org/jira/browse/HBASE-18228 > Project: HBase > Issue Type: Improvement > Components: hbck >Reporter: Lars Hofhansl >Priority: Critical > Fix For: 3.0.0 > > Attachments: HBASE-18228.branch-1.3.patch > > > We just had a prod issue and running HBCK the way we did actually causes more > problems. > In part HBCK did stuff we did not expect, in part we had little visibility > into what HBCK was doing, and in part the logging was confusing. > I'm proposing 2 improvements: > 1. A dry-run mode. Run, and just list what would have been done. > 2. An interactive mode. Run, and for each action request Y/N user input. So > that a user can opt-out of stuff. > [~jmhsieh], FYI -- This message was sent by Atlassian JIRA (v7.6.3#76005)