[jira] [Created] (HBASE-17716) Formalize Scan Metric names

2017-03-01 Thread Karan Mehta (JIRA)
Karan Mehta created HBASE-17716:
---

 Summary: Formalize Scan Metric names
 Key: HBASE-17716
 URL: https://issues.apache.org/jira/browse/HBASE-17716
 Project: HBase
  Issue Type: Bug
  Components: metrics
Reporter: Karan Mehta
Assignee: Karan Mehta
Priority: Minor


HBase provides various metrics through the APIs exposed by the ScanMetrics class. 
The JIRA PHOENIX-3248 requires them to be surfaced through the Phoenix Metrics 
API. Currently these metrics are referred to via hard-coded strings, which are not 
formal and can break the Phoenix API. Hence we need to refactor the code to 
assign enums for these metrics.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-17716) Formalize Scan Metric names

2017-03-01 Thread Karan Mehta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karan Mehta updated HBASE-17716:

Attachment: HBASE-17716.patch

> Formalize Scan Metric names
> ---
>
> Key: HBASE-17716
> URL: https://issues.apache.org/jira/browse/HBASE-17716
> Project: HBase
>  Issue Type: Bug
>  Components: metrics
>Reporter: Karan Mehta
>Assignee: Karan Mehta
>Priority: Minor
> Attachments: HBASE-17716.patch
>
>
> HBase provides various metrics through the API's exposed by ScanMetrics 
> class. 
> The JIRA PHOENIX-3248 requires them to be surfaced through the Phoenix 
> Metrics API. Currently these metrics are referred via hard-coded strings, 
> which are not formal and can break the Phoenix API. Hence we need to refactor 
> the code to assign enums for these metrics.





[jira] [Updated] (HBASE-17716) Formalize Scan Metric names

2017-03-01 Thread Karan Mehta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karan Mehta updated HBASE-17716:

Status: Patch Available  (was: Open)

> Formalize Scan Metric names
> ---
>
> Key: HBASE-17716
> URL: https://issues.apache.org/jira/browse/HBASE-17716
> Project: HBase
>  Issue Type: Bug
>  Components: metrics
>Reporter: Karan Mehta
>Assignee: Karan Mehta
>Priority: Minor
> Attachments: HBASE-17716.patch
>
>
> HBase provides various metrics through the API's exposed by ScanMetrics 
> class. 
> The JIRA PHOENIX-3248 requires them to be surfaced through the Phoenix 
> Metrics API. Currently these metrics are referred via hard-coded strings, 
> which are not formal and can break the Phoenix API. Hence we need to refactor 
> the code to assign enums for these metrics.





[jira] [Updated] (HBASE-17698) ReplicationEndpoint choosing sinks

2017-03-06 Thread Karan Mehta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karan Mehta updated HBASE-17698:

Attachment: HBASE-17698.patch

> ReplicationEndpoint choosing sinks
> --
>
> Key: HBASE-17698
> URL: https://issues.apache.org/jira/browse/HBASE-17698
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 2.0.0, 1.4.0
>Reporter: churro morales
>Assignee: Karan Mehta
> Attachments: HBASE-17698.patch
>
>
> The only time we choose new sinks is when we have a ConnectException, but we 
> have encountered other exceptions where there is a problem contacting a 
> particular sink and replication gets backed up for any sources that try that 
> sink
> HBASE-17675 occurred when there was a bad keytab refresh and the source was 
> stuck.
> Another issue we recently had was a bad drive controller on the sink side and 
> replication was stuck again.  
> Is there any reason not to choose new sinks anytime we have a 
> RemoteException?  I can understand TableNotFound we don't have to choose new 
> sinks, but for all other cases this seems like the safest approach.  





[jira] [Updated] (HBASE-17698) ReplicationEndpoint choosing sinks

2017-03-06 Thread Karan Mehta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karan Mehta updated HBASE-17698:

Status: Patch Available  (was: Open)

> ReplicationEndpoint choosing sinks
> --
>
> Key: HBASE-17698
> URL: https://issues.apache.org/jira/browse/HBASE-17698
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 2.0.0, 1.4.0
>Reporter: churro morales
>Assignee: Karan Mehta
> Attachments: HBASE-17698.patch
>
>
> The only time we choose new sinks is when we have a ConnectException, but we 
> have encountered other exceptions where there is a problem contacting a 
> particular sink and replication gets backed up for any sources that try that 
> sink
> HBASE-17675 occurred when there was a bad keytab refresh and the source was 
> stuck.
> Another issue we recently had was a bad drive controller on the sink side and 
> replication was stuck again.  
> Is there any reason not to choose new sinks anytime we have a 
> RemoteException?  I can understand TableNotFound we don't have to choose new 
> sinks, but for all other cases this seems like the safest approach.  





[jira] [Updated] (HBASE-17716) Formalize Scan Metric names

2017-03-06 Thread Karan Mehta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karan Mehta updated HBASE-17716:

Attachment: HBASE-17716_v2.patch

> Formalize Scan Metric names
> ---
>
> Key: HBASE-17716
> URL: https://issues.apache.org/jira/browse/HBASE-17716
> Project: HBase
>  Issue Type: Bug
>  Components: metrics
>Reporter: Karan Mehta
>Assignee: Karan Mehta
>Priority: Minor
> Attachments: HBASE-17716.patch, HBASE-17716_v2.patch
>
>
> HBase provides various metrics through the API's exposed by ScanMetrics 
> class. 
> The JIRA PHOENIX-3248 requires them to be surfaced through the Phoenix 
> Metrics API. Currently these metrics are referred via hard-coded strings, 
> which are not formal and can break the Phoenix API. Hence we need to refactor 
> the code to assign enums for these metrics.





[jira] [Commented] (HBASE-17716) Formalize Scan Metric names

2017-03-06 Thread Karan Mehta (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15898614#comment-15898614
 ] 

Karan Mehta commented on HBASE-17716:
-

Added a patch which appends the suffix "_METRIC_NAME" to the metric name 
strings and makes them public static final.

> Formalize Scan Metric names
> ---
>
> Key: HBASE-17716
> URL: https://issues.apache.org/jira/browse/HBASE-17716
> Project: HBase
>  Issue Type: Bug
>  Components: metrics
>Reporter: Karan Mehta
>Assignee: Karan Mehta
>Priority: Minor
> Attachments: HBASE-17716.patch, HBASE-17716_v2.patch
>
>
> HBase provides various metrics through the API's exposed by ScanMetrics 
> class. 
> The JIRA PHOENIX-3248 requires them to be surfaced through the Phoenix 
> Metrics API. Currently these metrics are referred via hard-coded strings, 
> which are not formal and can break the Phoenix API. Hence we need to refactor 
> the code to assign enums for these metrics.





[jira] [Commented] (HBASE-14925) Develop HBase shell command/tool to list table's region info through command line

2017-03-07 Thread Karan Mehta (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15900044#comment-15900044
 ] 

Karan Mehta commented on HBASE-14925:
-

Yes, I can take it up.

> Develop HBase shell command/tool to list table's region info through command 
> line
> -
>
> Key: HBASE-14925
> URL: https://issues.apache.org/jira/browse/HBASE-14925
> Project: HBase
>  Issue Type: Improvement
>  Components: shell
>Reporter: Romil Choksi
>Assignee: huaxiang sun
>
> I am going through the hbase shell commands to see if there is anything I can 
> use to get all the regions info just for a particular table. I don’t see any 
> such command that provides me that information.
> It would be better to have a command that provides region info, start key, 
> end key etc taking a table name as the input parameter. This is available 
> through HBase UI on clicking on a particular table's link
> A tool/shell command to get a list of regions for a table or all tables in a 
> tabular structured output (that is machine readable)





[jira] [Assigned] (HBASE-14925) Develop HBase shell command/tool to list table's region info through command line

2017-03-08 Thread Karan Mehta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karan Mehta reassigned HBASE-14925:
---

Assignee: Karan Mehta  (was: huaxiang sun)

> Develop HBase shell command/tool to list table's region info through command 
> line
> -
>
> Key: HBASE-14925
> URL: https://issues.apache.org/jira/browse/HBASE-14925
> Project: HBase
>  Issue Type: Improvement
>  Components: shell
>Reporter: Romil Choksi
>Assignee: Karan Mehta
>
> I am going through the hbase shell commands to see if there is anything I can 
> use to get all the regions info just for a particular table. I don’t see any 
> such command that provides me that information.
> It would be better to have a command that provides region info, start key, 
> end key etc taking a table name as the input parameter. This is available 
> through HBase UI on clicking on a particular table's link
> A tool/shell command to get a list of regions for a table or all tables in a 
> tabular structured output (that is machine readable)





[jira] [Updated] (HBASE-14925) Develop HBase shell command/tool to list table's region info through command line

2017-03-08 Thread Karan Mehta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karan Mehta updated HBASE-14925:

Attachment: HBASE-14925.patch

> Develop HBase shell command/tool to list table's region info through command 
> line
> -
>
> Key: HBASE-14925
> URL: https://issues.apache.org/jira/browse/HBASE-14925
> Project: HBase
>  Issue Type: Improvement
>  Components: shell
>Reporter: Romil Choksi
>Assignee: Karan Mehta
> Attachments: HBASE-14925.patch
>
>
> I am going through the hbase shell commands to see if there is anything I can 
> use to get all the regions info just for a particular table. I don’t see any 
> such command that provides me that information.
> It would be better to have a command that provides region info, start key, 
> end key etc taking a table name as the input parameter. This is available 
> through HBase UI on clicking on a particular table's link
> A tool/shell command to get a list of regions for a table or all tables in a 
> tabular structured output (that is machine readable)





[jira] [Updated] (HBASE-14925) Develop HBase shell command/tool to list table's region info through command line

2017-03-08 Thread Karan Mehta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karan Mehta updated HBASE-14925:

Status: Patch Available  (was: In Progress)

> Develop HBase shell command/tool to list table's region info through command 
> line
> -
>
> Key: HBASE-14925
> URL: https://issues.apache.org/jira/browse/HBASE-14925
> Project: HBase
>  Issue Type: Improvement
>  Components: shell
>Reporter: Romil Choksi
>Assignee: Karan Mehta
> Attachments: HBASE-14925.patch
>
>
> I am going through the hbase shell commands to see if there is anything I can 
> use to get all the regions info just for a particular table. I don’t see any 
> such command that provides me that information.
> It would be better to have a command that provides region info, start key, 
> end key etc taking a table name as the input parameter. This is available 
> through HBase UI on clicking on a particular table's link
> A tool/shell command to get a list of regions for a table or all tables in a 
> tabular structured output (that is machine readable)





[jira] [Commented] (HBASE-17698) ReplicationEndpoint choosing sinks

2017-03-15 Thread Karan Mehta (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15926492#comment-15926492
 ] 

Karan Mehta commented on HBASE-17698:
-

Hey [~apurtell], I have made the changes as suggested. Can you have a look? 
Thanks!

> ReplicationEndpoint choosing sinks
> --
>
> Key: HBASE-17698
> URL: https://issues.apache.org/jira/browse/HBASE-17698
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 2.0.0, 1.4.0
>Reporter: churro morales
>Assignee: Karan Mehta
> Attachments: HBASE-17698.patch
>
>
> The only time we choose new sinks is when we have a ConnectException, but we 
> have encountered other exceptions where there is a problem contacting a 
> particular sink and replication gets backed up for any sources that try that 
> sink
> HBASE-17675 occurred when there was a bad keytab refresh and the source was 
> stuck.
> Another issue we recently had was a bad drive controller on the sink side and 
> replication was stuck again.  
> Is there any reason not to choose new sinks anytime we have a 
> RemoteException?  I can understand TableNotFound we don't have to choose new 
> sinks, but for all other cases this seems like the safest approach.  





[jira] [Commented] (HBASE-14925) Develop HBase shell command/tool to list table's region info through command line

2017-03-15 Thread Karan Mehta (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15926502#comment-15926502
 ] 

Karan Mehta commented on HBASE-14925:
-

Hey [~stack], I have implemented this method as per Ronan's suggestions and 
submitted the patch. Can you please have a look? Thanks!

> Develop HBase shell command/tool to list table's region info through command 
> line
> -
>
> Key: HBASE-14925
> URL: https://issues.apache.org/jira/browse/HBASE-14925
> Project: HBase
>  Issue Type: Improvement
>  Components: shell
>Reporter: Romil Choksi
>Assignee: Karan Mehta
> Attachments: HBASE-14925.patch
>
>
> I am going through the hbase shell commands to see if there is anything I can 
> use to get all the regions info just for a particular table. I don’t see any 
> such command that provides me that information.
> It would be better to have a command that provides region info, start key, 
> end key etc taking a table name as the input parameter. This is available 
> through HBase UI on clicking on a particular table's link
> A tool/shell command to get a list of regions for a table or all tables in a 
> tabular structured output (that is machine readable)





[jira] [Commented] (HBASE-14925) Develop HBase shell command/tool to list table's region info through command line

2017-04-03 Thread Karan Mehta (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15954480#comment-15954480
 ] 

Karan Mehta commented on HBASE-14925:
-

[~ashish singhi]

{code}
  def command(table_name, region_server_name = "")
    admin_instance = admin.instance_variable_get("@admin")
    conn_instance = admin_instance.getConnection()
    cluster_status = admin_instance.getClusterStatus()
    hregion_locator_instance = conn_instance.getRegionLocator(TableName.valueOf(table_name))
    list = hregion_locator_instance.getAllRegionLocations()
    results = Array.new
    begin
      list.each do |hregion|
        # An empty region_server_name matches every server.
        if hregion.getServerName().toString.start_with? region_server_name
          startKey = Bytes.toString(hregion.getRegionInfo().getStartKey())
          endKey = Bytes.toString(hregion.getRegionInfo().getEndKey())
          puts "All = #{hregion} , Start = #{startKey}, End = #{endKey}"
        end
      end
    ensure
      hregion_locator_instance.close()
    end
  end
{code}

How does this code seem? It filters the regions by user provided {{Server 
Name}} as prefix.


> Develop HBase shell command/tool to list table's region info through command 
> line
> -
>
> Key: HBASE-14925
> URL: https://issues.apache.org/jira/browse/HBASE-14925
> Project: HBase
>  Issue Type: Improvement
>  Components: shell
>Reporter: Romil Choksi
>Assignee: Karan Mehta
> Attachments: HBASE-14925.patch
>
>
> I am going through the hbase shell commands to see if there is anything I can 
> use to get all the regions info just for a particular table. I don’t see any 
> such command that provides me that information.
> It would be better to have a command that provides region info, start key, 
> end key etc taking a table name as the input parameter. This is available 
> through HBase UI on clicking on a particular table's link
> A tool/shell command to get a list of regions for a table or all tables in a 
> tabular structured output (that is machine readable)





[jira] [Commented] (HBASE-14925) Develop HBase shell command/tool to list table's region info through command line

2017-04-04 Thread Karan Mehta (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15955753#comment-15955753
 ] 

Karan Mehta commented on HBASE-14925:
-

Thanks [~rstokes] for pointing out the use cases. The use cases typically 
suggest that the command will not be used very often. We typically want 
the data to be fast, but there are no specific hard deadlines.

This code does consider the fact that if a region server is not supplied, then 
information should be returned for all regions. {{start_with?}} would match all 
the results if provided with an empty string.
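
The empty-prefix behaviour described above can be checked in plain Ruby, outside the shell. This is only a minimal sketch: the server names are made up, and {{filter_by_server}} is a hypothetical helper, not something in the patch.

```ruby
# Hypothetical server names in the "host,port,startcode" form.
SERVERS = [
  "rs1.example.com,16020,1491300000000",
  "rs2.example.com,16020,1491300000001",
].freeze

# Keep only the names that begin with the given prefix.
# String#start_with? returns true for an empty prefix, so the default
# argument selects every server.
def filter_by_server(names, prefix = "")
  names.select { |name| name.start_with?(prefix) }
end

puts filter_by_server(SERVERS).length         # no prefix: every server
puts filter_by_server(SERVERS, "rs1").length  # prefix: matching servers only
```

Because the default argument is the empty string, one code path serves both the "all regions" and the "regions on one server" case.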

This patch is actually optimized for queries that just want all the regions of 
a particular table. For the other case, filtering by region server, we have to 
iterate through every region of the table and pick out the relevant ones, 
which may be expensive if a single table has plenty of regions. The approach is 
similar to the code that {{table.jsp}} uses to display the information in the 
web UI; the same APIs are used. If the web UI is acceptable in terms of 
latency, I feel this one should be too. Please share your comments on 
this one. 

Moreover, if the approach of the previous patch is followed, as below:
{code}
  for server in cluster_status.getServers()
    for name, region in cluster_status.getLoad(server).getRegionsLoad()
      region_name = region.getNameAsString()
      regionStoreFileSize = region.getStorefileSizeMB()
      regionRequests = region.getRequestsCount()
      if region_name.start_with? tgtTable
        results << { "server" => server, "name" => region_name, "size" => regionStoreFileSize, "requests" => regionRequests }
      end
    end
  end
{code}

It is difficult to retrieve the {{EndKey}} of a particular region since 
{{HRegionInfo}} is not present anywhere. {{StartKey}} is available if we call 
{{getLoad(ServerName).getRegionsLoad()}}, retrieve the region name and parse 
it. I believe {{EndKey}} is something that we are required to display as well. 
This patch can be used if we want to optimize the queries that filter by both 
tableName and regionServerName since that condition check can be done before 
the start of second loop. 
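
To illustrate the point about parsing the region name, here is a rough sketch of pulling the start key out of one. It assumes the name has the layout {{<table>,<start key>,<region id>.<encoded name>.}}; the helper name and the sample region name are invented for illustration.

```ruby
# Assumed region name layout: "<table>,<start key>,<region id>.<encoded name>."
# The table name cannot contain a comma but the start key may, so only the
# first and the last comma are treated as separators.
def start_key_from_region_name(region_name)
  first = region_name.index(",")
  last = region_name.rindex(",")
  # Fewer than two commas: not a region name in the assumed layout.
  return nil if first.nil? || first == last
  region_name[(first + 1)...last]
end

# Made-up region name whose start key itself contains a comma.
name = "t1,row,500,1491300000000.0123456789abcdef0123456789abcdef."
puts start_key_from_region_name(name)
```

Nothing in the region name carries the end key, which is why this route cannot produce it without {{HRegionInfo}}.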

> Develop HBase shell command/tool to list table's region info through command 
> line
> -
>
> Key: HBASE-14925
> URL: https://issues.apache.org/jira/browse/HBASE-14925
> Project: HBase
>  Issue Type: Improvement
>  Components: shell
>Reporter: Romil Choksi
>Assignee: Karan Mehta
> Attachments: HBASE-14925.patch
>
>
> I am going through the hbase shell commands to see if there is anything I can 
> use to get all the regions info just for a particular table. I don’t see any 
> such command that provides me that information.
> It would be better to have a command that provides region info, start key, 
> end key etc taking a table name as the input parameter. This is available 
> through HBase UI on clicking on a particular table's link
> A tool/shell command to get a list of regions for a table or all tables in a 
> tabular structured output (that is machine readable)





[jira] [Updated] (HBASE-14925) Develop HBase shell command/tool to list table's region info through command line

2017-04-04 Thread Karan Mehta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karan Mehta updated HBASE-14925:

Attachment: HBASE-14925.002.patch

> Develop HBase shell command/tool to list table's region info through command 
> line
> -
>
> Key: HBASE-14925
> URL: https://issues.apache.org/jira/browse/HBASE-14925
> Project: HBase
>  Issue Type: Improvement
>  Components: shell
>Reporter: Romil Choksi
>Assignee: Karan Mehta
> Attachments: HBASE-14925.002.patch, HBASE-14925.patch
>
>
> I am going through the hbase shell commands to see if there is anything I can 
> use to get all the regions info just for a particular table. I don’t see any 
> such command that provides me that information.
> It would be better to have a command that provides region info, start key, 
> end key etc taking a table name as the input parameter. This is available 
> through HBase UI on clicking on a particular table's link
> A tool/shell command to get a list of regions for a table or all tables in a 
> tabular structured output (that is machine readable)





[jira] [Commented] (HBASE-14925) Develop HBase shell command/tool to list table's region info through command line

2017-04-04 Thread Karan Mehta (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15955953#comment-15955953
 ] 

Karan Mehta commented on HBASE-14925:
-

Added a new patch with required changes. Can you take a look [~rstokes] and 
[~ashish singhi]?

> Develop HBase shell command/tool to list table's region info through command 
> line
> -
>
> Key: HBASE-14925
> URL: https://issues.apache.org/jira/browse/HBASE-14925
> Project: HBase
>  Issue Type: Improvement
>  Components: shell
>Reporter: Romil Choksi
>Assignee: Karan Mehta
> Attachments: HBASE-14925.002.patch, HBASE-14925.patch
>
>
> I am going through the hbase shell commands to see if there is anything I can 
> use to get all the regions info just for a particular table. I don’t see any 
> such command that provides me that information.
> It would be better to have a command that provides region info, start key, 
> end key etc taking a table name as the input parameter. This is available 
> through HBase UI on clicking on a particular table's link
> A tool/shell command to get a list of regions for a table or all tables in a 
> tabular structured output (that is machine readable)





[jira] [Updated] (HBASE-14925) Develop HBase shell command/tool to list table's region info through command line

2017-04-05 Thread Karan Mehta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karan Mehta updated HBASE-14925:

Attachment: HBASE-14925.003.patch

> Develop HBase shell command/tool to list table's region info through command 
> line
> -
>
> Key: HBASE-14925
> URL: https://issues.apache.org/jira/browse/HBASE-14925
> Project: HBase
>  Issue Type: Improvement
>  Components: shell
>Reporter: Romil Choksi
>Assignee: Karan Mehta
> Attachments: HBASE-14925.002.patch, HBASE-14925.003.patch, 
> HBASE-14925.patch
>
>
> I am going through the hbase shell commands to see if there is anything I can 
> use to get all the regions info just for a particular table. I don’t see any 
> such command that provides me that information.
> It would be better to have a command that provides region info, start key, 
> end key etc taking a table name as the input parameter. This is available 
> through HBase UI on clicking on a particular table's link
> A tool/shell command to get a list of regions for a table or all tables in a 
> tabular structured output (that is machine readable)





[jira] [Commented] (HBASE-14925) Develop HBase shell command/tool to list table's region info through command line

2017-04-05 Thread Karan Mehta (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15957799#comment-15957799
 ] 

Karan Mehta commented on HBASE-14925:
-

{quote}
Where are we using this ?
@end_time = Time.now
{quote}
This is to determine the time taken for the command to execute, the 
{{start_time}} variable is initialized before the call to this function.

Added a new patch with proper formatting now. Suggest if there are better ways 
to format it. The {{formatter.rb}} file doesn't help format if there are more 
than 2 columns to be displayed. 
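
For discussion, one possible way to lay out more than two columns is to size each column to its widest cell. The sketch below is plain Ruby and does not touch {{formatter.rb}}; {{format_table}}, the column titles and the sample rows are all hypothetical.

```ruby
# Build the lines of a fixed-width table, sizing each column to its widest
# cell. headers: array of column titles; rows: array of arrays of cells.
def format_table(headers, rows)
  widths = headers.each_index.map do |i|
    ([headers[i]] + rows.map { |r| r[i].to_s }).map(&:length).max
  end
  line = lambda do |cells|
    cells.each_with_index.map { |c, i| c.to_s.ljust(widths[i]) }.join(" | ").rstrip
  end
  separator = widths.map { |w| "-" * w }.join("-+-")
  [line.call(headers), separator] + rows.map { |r| line.call(r) }
end

# Made-up sample data.
rows = [
  ["rs1.example.com,16020,1491300000000", "t1,,1491300000000.abc.", "", "row500"],
  ["rs2.example.com,16020,1491300000001", "t1,row500,1491300000002.def.", "row500", ""],
]
puts format_table(["SERVER_NAME", "REGION_NAME", "START_KEY", "END_KEY"], rows)
```

This keeps the columns aligned, but a row wider than the terminal will still wrap.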

> Develop HBase shell command/tool to list table's region info through command 
> line
> -
>
> Key: HBASE-14925
> URL: https://issues.apache.org/jira/browse/HBASE-14925
> Project: HBase
>  Issue Type: Improvement
>  Components: shell
>Reporter: Romil Choksi
>Assignee: Karan Mehta
> Attachments: HBASE-14925.002.patch, HBASE-14925.003.patch, 
> HBASE-14925.patch
>
>
> I am going through the hbase shell commands to see if there is anything I can 
> use to get all the regions info just for a particular table. I don’t see any 
> such command that provides me that information.
> It would be better to have a command that provides region info, start key, 
> end key etc taking a table name as the input parameter. This is available 
> through HBase UI on clicking on a particular table's link
> A tool/shell command to get a list of regions for a table or all tables in a 
> tabular structured output (that is machine readable)





[jira] [Comment Edited] (HBASE-14925) Develop HBase shell command/tool to list table's region info through command line

2017-04-05 Thread Karan Mehta (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15957799#comment-15957799
 ] 

Karan Mehta edited comment on HBASE-14925 at 4/5/17 9:46 PM:
-

{quote}
Where are we using this ?
@end_time = Time.now
{quote}
This is to determine the time taken for the command to execute, the 
{{start_time}} variable is initialized before the call to this function.

Added a new patch with proper formatting now. Please suggest if there are 
better ways to format it. The formatting may be a little disrupted if table 
names are long or the screen is narrow, since a single row of data then 
wraps onto two or more lines. 
The {{formatter.rb}} file doesn't help format if there are more than 2 columns 
to be displayed. 


was (Author: karanmehta93):
{quote}
Where are we using this ?
@end_time = Time.now
{quote}
This is to determine the time taken for the command to execute, the 
{{start_time}} variable is initialized before the call to this function.

Added a new patch with proper formatting now. Suggest if there are better ways 
to format it. The {{formatter.rb}} file doesn't help format if there are more 
than 2 columns to be displayed. 

> Develop HBase shell command/tool to list table's region info through command 
> line
> -
>
> Key: HBASE-14925
> URL: https://issues.apache.org/jira/browse/HBASE-14925
> Project: HBase
>  Issue Type: Improvement
>  Components: shell
>Reporter: Romil Choksi
>Assignee: Karan Mehta
> Attachments: HBASE-14925.002.patch, HBASE-14925.003.patch, 
> HBASE-14925.patch
>
>
> I am going through the hbase shell commands to see if there is anything I can 
> use to get all the regions info just for a particular table. I don’t see any 
> such command that provides me that information.
> It would be better to have a command that provides region info, start key, 
> end key etc taking a table name as the input parameter. This is available 
> through HBase UI on clicking on a particular table's link
> A tool/shell command to get a list of regions for a table or all tables in a 
> tabular structured output (that is machine readable)





[jira] [Commented] (HBASE-14925) Develop HBase shell command/tool to list table's region info through command line

2017-04-06 Thread Karan Mehta (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15959491#comment-15959491
 ] 

Karan Mehta commented on HBASE-14925:
-

bq. I could not find the start_time variable in the patch and also end_time 
variable value not being used anywhere in the patch. What am I missing ? Can 
you point me ?

Here is the code snippet from the {{commands.rb}} file. The method 
{{command_safe()}} is a wrapper to the actual command call which initializes 
the global variable {{start_time}}. It also automatically computes the value of 
{{end_time}} and displays the total execution time. We can override these 
values. Since I didn't want the output formatting and display time to be 
included in the actual execution of the command, I provided a value to that 
variable before the output stuff. Does that seem okay?

{code}
  # wrap an execution of cmd to catch hbase exceptions
  # cmd - command name to execute
  # args - arguments to pass to the command
  def command_safe(debug, cmd = :command, *args)
    # Commands can overwrite start_time to skip time used in some kind of setup.
    # See count.rb for example.
    @start_time = Time.now
    # send is internal ruby method to call 'cmd' with *args
    # (everything is a message, so this is just the formal semantics to support that idiom)
    translate_hbase_exceptions(*args) { send(cmd, *args) }
  rescue => e
    rootCause = e
    while rootCause != nil && rootCause.respond_to?(:cause) && rootCause.cause != nil
      rootCause = rootCause.cause
    end
    if @shell.interactive?
      puts
      puts "ERROR: #{rootCause}"
      puts "Backtrace: #{rootCause.backtrace.join("\n   ")}" if debug
      puts
      puts help
      puts
    else
      raise rootCause
    end
  ensure
    # If end_time is not already set by the command, use current time.
    @end_time ||= Time.now
    formatter.output_str("Took %.4f seconds" % [@end_time - @start_time])
  end
{code}
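
The effect of setting {{@end_time}} inside the command can be seen in a small standalone sketch. {{TimedCommand}} below is a hypothetical stand-in, not the real shell command class, and the {{sleep}} merely imitates slow output formatting.

```ruby
# Hypothetical stand-in for a shell command; it only mimics the timing
# behaviour of the wrapper quoted above.
class TimedCommand
  attr_reader :start_time, :end_time

  # Wrapper: records the start time, runs the command, and fills in the
  # end time only if the command did not already set it.
  def command_safe
    @start_time = Time.now
    command
  ensure
    @end_time ||= Time.now
  end

  # The work finishes, the end time is pinned, and only then does the
  # (simulated) output formatting run, so it is not counted.
  def command
    result = (1..1000).reduce(:+)  # stands in for the region lookup
    @end_time = Time.now
    sleep 0.05                     # stands in for output formatting
    result
  end

  def elapsed
    @end_time - @start_time
  end
end

cmd = TimedCommand.new
cmd.command_safe
puts format("Took %.4f seconds", cmd.elapsed)
```

Since the wrapper uses {{||=}}, a value assigned by the command survives, and the reported time leaves out whatever runs after the assignment.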

> Develop HBase shell command/tool to list table's region info through command 
> line
> -
>
> Key: HBASE-14925
> URL: https://issues.apache.org/jira/browse/HBASE-14925
> Project: HBase
>  Issue Type: Improvement
>  Components: shell
>Reporter: Romil Choksi
>Assignee: Karan Mehta
> Attachments: HBASE-14925.002.patch, HBASE-14925.003.patch, 
> HBASE-14925.patch
>
>
> I am going through the hbase shell commands to see if there is anything I can 
> use to get all the regions info just for a particular table. I don’t see any 
> such command that provides me that information.
> It would be better to have a command that provides region info, start key, 
> end key etc taking a table name as the input parameter. This is available 
> through HBase UI on clicking on a particular table's link
> A tool/shell command to get a list of regions for a table or all tables in a 
> tabular structured output (that is machine readable)





[jira] [Updated] (HBASE-17965) Canary tool should print the regionserver name on failure

2017-04-26 Thread Karan Mehta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karan Mehta updated HBASE-17965:

Attachment: HBASE-17965.001.patch

> Canary tool should print the regionserver name on failure
> -
>
> Key: HBASE-17965
> URL: https://issues.apache.org/jira/browse/HBASE-17965
> Project: HBase
>  Issue Type: Task
>Reporter: churro morales
>Assignee: Karan Mehta
>Priority: Minor
> Attachments: HBASE-17965.001.patch
>
>
> It would be nice when we have a canary failure for a region to print the 
> associated regionserver's name in the log as well. 





[jira] [Updated] (HBASE-17965) Canary tool should print the regionserver name on failure

2017-04-26 Thread Karan Mehta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karan Mehta updated HBASE-17965:

Status: Patch Available  (was: Open)

> Canary tool should print the regionserver name on failure
> -
>
> Key: HBASE-17965
> URL: https://issues.apache.org/jira/browse/HBASE-17965
> Project: HBase
>  Issue Type: Task
>Reporter: churro morales
>Assignee: Karan Mehta
>Priority: Minor
> Attachments: HBASE-17965.001.patch
>
>
> It would be nice when we have a canary failure for a region to print the 
> associated regionserver's name in the log as well. 





[jira] [Commented] (HBASE-14925) Develop HBase shell command/tool to list table's region info through command line

2017-04-26 Thread Karan Mehta (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15985849#comment-15985849
 ] 

Karan Mehta commented on HBASE-14925:
-

Can you take a final look [~ashish singhi] and suggest changes?

> Develop HBase shell command/tool to list table's region info through command 
> line
> -
>
> Key: HBASE-14925
> URL: https://issues.apache.org/jira/browse/HBASE-14925
> Project: HBase
>  Issue Type: Improvement
>  Components: shell
>Reporter: Romil Choksi
>Assignee: Karan Mehta
> Attachments: HBASE-14925.002.patch, HBASE-14925.003.patch, 
> HBASE-14925.patch
>
>
> I am going through the hbase shell commands to see if there is anything I can 
> use to get all the regions info just for a particular table. I don’t see any 
> such command that provides me that information.
> It would be better to have a command that provides region info, start key, 
> end key etc taking a table name as the input parameter. This is available 
> through HBase UI on clicking on a particular table's link
> A tool/shell command to get a list of regions for a table or all tables in a 
> tabular structured output (that is machine readable)





[jira] [Commented] (HBASE-14925) Develop HBase shell command/tool to list table's region info through command line

2017-04-29 Thread Karan Mehta (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15990017#comment-15990017
 ] 

Karan Mehta commented on HBASE-14925:
-

Thank you [~enis] and [~busbey] for your comments. I can look into them, 
although I am not completely sure how to address all of them.

1. I will update all usages of Bytes::toString() to Bytes::toStringBinary().
2. The internal formatter class is not good enough to display the output in 
proper tabular form, so I tried setting the column width manually to a fixed 
size. Let me try some other approaches as well. I am not sure how the output 
would look if the screen is narrower than a row.
3. For projecting out specific columns, I can provide an addendum that lets 
users add extra parameters after the initial params. If the user doesn't 
specify any columns, all the data is displayed by default; otherwise only the 
requested columns are displayed. Please suggest how the user should provide 
this input. I believe the command-line argument parser can only parse values 
separated by commas.
4. I am not sure how release notes are handled since I am new to this, but I am 
happy to learn. Please point me to the relevant info.

> Develop HBase shell command/tool to list table's region info through command 
> line
> -
>
> Key: HBASE-14925
> URL: https://issues.apache.org/jira/browse/HBASE-14925
> Project: HBase
>  Issue Type: Improvement
>  Components: shell
>Reporter: Romil Choksi
>Assignee: Karan Mehta
> Fix For: 2.0.0, 1.4.0
>
> Attachments: HBASE-14925.002.patch, HBASE-14925.003.patch, 
> HBASE-14925.patch
>
>
> I am going through the hbase shell commands to see if there is anything I can 
> use to get all the regions info just for a particular table. I don’t see any 
> such command that provides me that information.
> It would be better to have a command that provides region info, start key, 
> end key etc taking a table name as the input parameter. This is available 
> through HBase UI on clicking on a particular table's link
> A tool/shell command to get a list of regions for a table or all tables in a 
> tabular structured output (that is machine readable)





[jira] [Commented] (HBASE-17973) Create shell command to identify regions with poor locality

2017-05-02 Thread Karan Mehta (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15993800#comment-15993800
 ] 

Karan Mehta commented on HBASE-17973:
-

[~elserj]
The patch is already committed, but the JIRA is still shown as unresolved. What 
is its current status? I need to put up an addendum for HBASE-14925, but when I 
try running the {{list_regions}} command, I get the following error:
{code}
ERROR: undefined method `filter' for #
{code}

Can you please look into this?

> Create shell command to identify regions with poor locality
> ---
>
> Key: HBASE-17973
> URL: https://issues.apache.org/jira/browse/HBASE-17973
> Project: HBase
>  Issue Type: Improvement
>  Components: shell
>Reporter: Josh Elser
>Assignee: Josh Elser
> Fix For: 2.0.0
>
> Attachments: HBASE-17973.001.patch, HBASE-17973.002.patch, 
> HBASE-17973.003.patch
>
>
> The data locality of regions often plays a large role in the efficiency of 
> HBase. Compactions are also expensive to execute, especially on very large 
> tables. The balancer can do a good job trying to maintain locality (when 
> tuned properly), but it is not perfect.
> This creates a less-than-desirable situation where it's a costly operation to 
> take a cluster with spotty poor locality (e.g. a small percentage of 
> regionservers with poor locality).
> We already have this information available via the {{ClusterStatus}} proto. 
> We can easily write a shell command that can present regions which are 
> lacking a certain percentage of locality.





[jira] [Commented] (HBASE-17973) Create shell command to identify regions with poor locality

2017-05-02 Thread Karan Mehta (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15993846#comment-15993846
 ] 

Karan Mehta commented on HBASE-17973:
-

[~elserj] 
I had a look at the comment earlier, but I was wondering whether Arrays have a 
built-in {{filter}} method, since none was explicitly defined.
I didn't understand what this line does and couldn't find anything relevant 
online. Can you please clarify what it is for?
{code}
+  regions = hregion_locator_list.filter do |hregion|
{code}

Thank you!
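
For reference, a minimal sketch of the idiom in question. Ruby's {{Array}} gained {{filter}} only in Ruby 2.6, as an alias for {{select}}; assuming the shell's interpreter predates that, {{select}} is the portable spelling. The region data below is hypothetical and only illustrates the call; it is not taken from the patch.

```ruby
# Hypothetical stand-in for the region list: each entry pairs a region
# name with a locality value between 0.0 and 1.0.
regions = [
  { name: 'region-a', locality: 0.95 },
  { name: 'region-b', locality: 0.40 },
  { name: 'region-c', locality: 0.75 }
]

# Array#select keeps the elements for which the block returns a truthy value.
# It is available on old and new Ruby versions alike, whereas Array#filter
# is an alias that was only added in Ruby 2.6.
poor_locality = regions.select { |region| region[:locality] < 0.8 }

poor_locality.each { |region| puts region[:name] }
```

On an interpreter without {{Array#filter}}, calling {{filter}} on the same array raises an "undefined method" error, which would match the symptom reported above.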


> Create shell command to identify regions with poor locality
> ---
>
> Key: HBASE-17973
> URL: https://issues.apache.org/jira/browse/HBASE-17973
> Project: HBase
>  Issue Type: Improvement
>  Components: shell
>Reporter: Josh Elser
>Assignee: Josh Elser
> Fix For: 2.0.0
>
> Attachments: HBASE-17973.001.patch, HBASE-17973.002.patch, 
> HBASE-17973.003.patch
>
>
> The data locality of regions often plays a large role in the efficiency of 
> HBase. Compactions are also expensive to execute, especially on very large 
> tables. The balancer can do a good job trying to maintain locality (when 
> tuned properly), but it is not perfect.
> This creates a less-than-desirable situation where it's a costly operation to 
> take a cluster with spotty poor locality (e.g. a small percentage of 
> regionservers with poor locality).
> We already have this information available via the {{ClusterStatus}} proto. 
> We can easily write a shell command that can present regions which are 
> lacking a certain percentage of locality.





[jira] [Commented] (HBASE-17973) Create shell command to identify regions with poor locality

2017-05-02 Thread Karan Mehta (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15993921#comment-15993921
 ] 

Karan Mehta commented on HBASE-17973:
-

Thank you for the clarification and the patch, [~elserj].
One more suggestion: could you attach the patch you committed to this JIRA? 
I feel it makes things a little easier. :)

> Create shell command to identify regions with poor locality
> ---
>
> Key: HBASE-17973
> URL: https://issues.apache.org/jira/browse/HBASE-17973
> Project: HBase
>  Issue Type: Improvement
>  Components: shell
>Reporter: Josh Elser
>Assignee: Josh Elser
> Fix For: 2.0.0
>
> Attachments: HBASE-17973.001.patch, HBASE-17973.002.patch, 
> HBASE-17973.003.patch
>
>
> The data locality of regions often plays a large role in the efficiency of 
> HBase. Compactions are also expensive to execute, especially on very large 
> tables. The balancer can do a good job trying to maintain locality (when 
> tuned properly), but it is not perfect.
> This creates a less-than-desirable situation where it's a costly operation to 
> take a cluster with spotty poor locality (e.g. a small percentage of 
> regionservers with poor locality).
> We already have this information available via the {{ClusterStatus}} proto. 
> We can easily write a shell command that can present regions which are 
> lacking a certain percentage of locality.





[jira] [Commented] (HBASE-17973) Create shell command to identify regions with poor locality

2017-05-02 Thread Karan Mehta (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15994221#comment-15994221
 ] 

Karan Mehta commented on HBASE-17973:
-

Yes, it is there. Sorry for that. I didn't observe that v3 was an addendum and 
not a complete patch. Thanks! 

> Create shell command to identify regions with poor locality
> ---
>
> Key: HBASE-17973
> URL: https://issues.apache.org/jira/browse/HBASE-17973
> Project: HBase
>  Issue Type: Improvement
>  Components: shell
>Reporter: Josh Elser
>Assignee: Josh Elser
> Fix For: 2.0.0
>
> Attachments: HBASE-17973.001.patch, HBASE-17973.002.patch, 
> HBASE-17973.003.patch
>
>
> The data locality of regions often plays a large role in the efficiency of 
> HBase. Compactions are also expensive to execute, especially on very large 
> tables. The balancer can do a good job trying to maintain locality (when 
> tuned properly), but it is not perfect.
> This creates a less-than-desirable situation where it's a costly operation to 
> take a cluster with spotty poor locality (e.g. a small percentage of 
> regionservers with poor locality).
> We already have this information available via the {{ClusterStatus}} proto. 
> We can easily write a shell command that can present regions which are 
> lacking a certain percentage of locality.





[jira] [Updated] (HBASE-14925) Develop HBase shell command/tool to list table's region info through command line

2017-05-03 Thread Karan Mehta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karan Mehta updated HBASE-14925:

Status: Patch Available  (was: Reopened)

> Develop HBase shell command/tool to list table's region info through command 
> line
> -
>
> Key: HBASE-14925
> URL: https://issues.apache.org/jira/browse/HBASE-14925
> Project: HBase
>  Issue Type: Improvement
>  Components: shell
>Reporter: Romil Choksi
>Assignee: Karan Mehta
> Fix For: 2.0.0, 1.4.0
>
> Attachments: HBASE-14925.002.patch, 
> HBASE-14925.003.addendum.001.patch, HBASE-14925.003.patch, HBASE-14925.patch
>
>
> I am going through the hbase shell commands to see if there is anything I can 
> use to get all the regions info just for a particular table. I don’t see any 
> such command that provides me that information.
> It would be better to have a command that provides region info, start key, 
> end key etc taking a table name as the input parameter. This is available 
> through HBase UI on clicking on a particular table's link
> A tool/shell command to get a list of regions for a table or all tables in a 
> tabular structured output (that is machine readable)





[jira] [Updated] (HBASE-14925) Develop HBase shell command/tool to list table's region info through command line

2017-05-03 Thread Karan Mehta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karan Mehta updated HBASE-14925:

Attachment: HBASE-14925.003.addendum.001.patch

> Develop HBase shell command/tool to list table's region info through command 
> line
> -
>
> Key: HBASE-14925
> URL: https://issues.apache.org/jira/browse/HBASE-14925
> Project: HBase
>  Issue Type: Improvement
>  Components: shell
>Reporter: Romil Choksi
>Assignee: Karan Mehta
> Fix For: 2.0.0, 1.4.0
>
> Attachments: HBASE-14925.002.patch, 
> HBASE-14925.003.addendum.001.patch, HBASE-14925.003.patch, HBASE-14925.patch
>
>
> I am going through the hbase shell commands to see if there is anything I can 
> use to get all the regions info just for a particular table. I don’t see any 
> such command that provides me that information.
> It would be better to have a command that provides region info, start key, 
> end key etc taking a table name as the input parameter. This is available 
> through HBase UI on clicking on a particular table's link
> A tool/shell command to get a list of regions for a table or all tables in a 
> tabular structured output (that is machine readable)





[jira] [Updated] (HBASE-14925) Develop HBase shell command/tool to list table's region info through command line

2017-05-03 Thread Karan Mehta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karan Mehta updated HBASE-14925:

Release Note: 
Added a shell command 'list_regions' for displaying a table's region info 
from the command line.

The command lists all regions of a table as an array, optionally filtered by 
server name (matched as a prefix) and by maximum locality. By default, it 
returns all regions of the table regardless of locality.

For each region, the command displays the server name, region name, start key, 
end key, region size in MB, number of requests, and locality. This information 
can be projected by passing an array as the third parameter; by default, all of 
it is displayed. Possible array values are SERVER_NAME, REGION_NAME, 
START_KEY, END_KEY, SIZE, REQ and LOCALITY. Values are not case sensitive. If 
you don't want to filter by server name, pass an empty hash or string as shown 
below.

Examples:
hbase> list_regions 'table_name'
hbase> list_regions 'table_name', 'server_name'
hbase> list_regions 'table_name', {SERVER_NAME => 'server_name', LOCALITY_THRESHOLD => 0.8}
hbase> list_regions 'table_name', {SERVER_NAME => 'server_name', LOCALITY_THRESHOLD => 0.8}, ['SERVER_NAME']
hbase> list_regions 'table_name', {}, ['SERVER_NAME', 'start_key']
hbase> list_regions 'table_name', '', ['SERVER_NAME', 'start_key']
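
The filtering and projection rules described in this note can be sketched in plain Ruby. This is only an illustration under stated assumptions: the {{REGIONS}} sample data and the {{project_regions}} helper are hypothetical and are not the shell's actual implementation, and the sketch assumes that the locality threshold keeps regions whose locality is at most the given value.

```ruby
# Hypothetical sample rows; the real command reads this data from the cluster.
REGIONS = [
  { 'SERVER_NAME' => 'host1,16020,1', 'REGION_NAME' => 't1,,1.abc.',
    'START_KEY' => '', 'LOCALITY' => 1.0 },
  { 'SERVER_NAME' => 'host2,16020,1', 'REGION_NAME' => 't1,m,1.def.',
    'START_KEY' => 'm', 'LOCALITY' => 0.5 }
].freeze

# Hypothetical helper mirroring the semantics described in the release note:
# - options may be a server-name string or a hash; '' or {} means "no filter"
# - SERVER_NAME is matched as a prefix
# - LOCALITY_THRESHOLD keeps regions with locality <= threshold (assumption)
# - cols selects columns case-insensitively; nil means "all columns"
def project_regions(rows, options = {}, cols = nil)
  options = { 'SERVER_NAME' => options } if options.is_a?(String)
  server = options['SERVER_NAME'].to_s
  threshold = options['LOCALITY_THRESHOLD']

  selected = rows.select do |row|
    row['SERVER_NAME'].start_with?(server) &&
      (threshold.nil? || row['LOCALITY'] <= threshold)
  end

  return selected if cols.nil?

  wanted = cols.map(&:upcase)
  selected.map { |row| row.select { |key, _value| wanted.include?(key) } }
end

p project_regions(REGIONS, { 'LOCALITY_THRESHOLD' => 0.8 }, ['server_name', 'START_KEY'])
```

Treating an empty string and an empty hash the same way keeps the third (projection) parameter usable without forcing a server-name filter, which is the case the last two examples cover.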

> Develop HBase shell command/tool to list table's region info through command 
> line
> -
>
> Key: HBASE-14925
> URL: https://issues.apache.org/jira/browse/HBASE-14925
> Project: HBase
>  Issue Type: Improvement
>  Components: shell
>Reporter: Romil Choksi
>Assignee: Karan Mehta
> Fix For: 2.0.0, 1.4.0
>
> Attachments: HBASE-14925.002.patch, 
> HBASE-14925.003.addendum.001.patch, HBASE-14925.003.patch, HBASE-14925.patch
>
>
> I am going through the hbase shell commands to see if there is anything I can 
> use to get all the regions info just for a particular table. I don’t see any 
> such command that provides me that information.
> It would be better to have a command that provides region info, start key, 
> end key etc taking a table name as the input parameter. This is available 
> through HBase UI on clicking on a particular table's link
> A tool/shell command to get a list of regions for a table or all tables in a 
> tabular structured output (that is machine readable)





[jira] [Commented] (HBASE-14925) Develop HBase shell command/tool to list table's region info through command line

2017-05-03 Thread Karan Mehta (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15995664#comment-15995664
 ] 

Karan Mehta commented on HBASE-14925:
-

I have put up an addendum patch and added the release notes. Can you have a 
look and provide your comments?

> Develop HBase shell command/tool to list table's region info through command 
> line
> -
>
> Key: HBASE-14925
> URL: https://issues.apache.org/jira/browse/HBASE-14925
> Project: HBase
>  Issue Type: Improvement
>  Components: shell
>Reporter: Romil Choksi
>Assignee: Karan Mehta
> Fix For: 2.0.0, 1.4.0
>
> Attachments: HBASE-14925.002.patch, 
> HBASE-14925.003.addendum.001.patch, HBASE-14925.003.patch, HBASE-14925.patch
>
>
> I am going through the hbase shell commands to see if there is anything I can 
> use to get all the regions info just for a particular table. I don’t see any 
> such command that provides me that information.
> It would be better to have a command that provides region info, start key, 
> end key etc taking a table name as the input parameter. This is available 
> through HBase UI on clicking on a particular table's link
> A tool/shell command to get a list of regions for a table or all tables in a 
> tabular structured output (that is machine readable)





[jira] [Created] (HBASE-17998) Improve HBase RPC write throttling size estimation

2017-05-04 Thread Karan Mehta (JIRA)
Karan Mehta created HBASE-17998:
---

 Summary: Improve HBase RPC write throttling size estimation
 Key: HBASE-17998
 URL: https://issues.apache.org/jira/browse/HBASE-17998
 Project: HBase
  Issue Type: Improvement
Reporter: Karan Mehta
Assignee: Karan Mehta


Currently, when throttling RPCs, the size of each put is estimated using a 
hardcoded value of 100 bytes. This can be improved by using the protobuf size as 
an estimate, without having to deserialize or do a big refactoring.






[jira] [Commented] (HBASE-18000) Make sure we always return the scanner id with ScanResponse

2017-05-05 Thread Karan Mehta (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15998528#comment-15998528
 ] 

Karan Mehta commented on HBASE-18000:
-

Isn't the patch being applied to version 1.3.1 as well?

> Make sure we always return the scanner id with ScanResponse
> ---
>
> Key: HBASE-18000
> URL: https://issues.apache.org/jira/browse/HBASE-18000
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0, 1.4.0, 1.3.1
>Reporter: Lars Hofhansl
>Assignee: Duo Zhang
> Fix For: 2.0.0, 1.4.0, 1.3.2
>
> Attachments: HBASE-18000.patch
>
>
> Some external tooling (like OpenTSDB) relies on the scanner id to tie 
> asynchronous responses back to their requests.
> (see comments on HBASE-17489)





[jira] [Commented] (HBASE-14925) Develop HBase shell command/tool to list table's region info through command line

2017-05-05 Thread Karan Mehta (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15998960#comment-15998960
 ] 

Karan Mehta commented on HBASE-14925:
-

Thank you for committing, [~ashish singhi]
I will provide the addendum for branch-1 after I get a patch for HBASE-17973 
for branch-1 since it is dependent on it, or else there will be merge conflicts.

> Develop HBase shell command/tool to list table's region info through command 
> line
> -
>
> Key: HBASE-14925
> URL: https://issues.apache.org/jira/browse/HBASE-14925
> Project: HBase
>  Issue Type: Improvement
>  Components: shell
>Reporter: Romil Choksi
>Assignee: Karan Mehta
> Fix For: 2.0.0, 1.4.0
>
> Attachments: HBASE-14925.002.patch, 
> HBASE-14925.003.addendum.001.patch, HBASE-14925.003.patch, HBASE-14925.patch
>
>
> I am going through the hbase shell commands to see if there is anything I can 
> use to get all the regions info just for a particular table. I don’t see any 
> such command that provides me that information.
> It would be better to have a command that provides region info, start key, 
> end key etc taking a table name as the input parameter. This is available 
> through HBase UI on clicking on a particular table's link
> A tool/shell command to get a list of regions for a table or all tables in a 
> tabular structured output (that is machine readable)





[jira] [Commented] (HBASE-17973) Create shell command to identify regions with poor locality

2017-05-05 Thread Karan Mehta (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15998962#comment-15998962
 ] 

Karan Mehta commented on HBASE-17973:
-

Hey [~elserj], can you backport the patch to {{branch-1}}? The addendum patch 
for HBASE-14925 (branch-1) depends on it.
Thanks!

> Create shell command to identify regions with poor locality
> ---
>
> Key: HBASE-17973
> URL: https://issues.apache.org/jira/browse/HBASE-17973
> Project: HBase
>  Issue Type: Improvement
>  Components: shell
>Reporter: Josh Elser
>Assignee: Josh Elser
> Fix For: 2.0.0
>
> Attachments: HBASE-17973.001.patch, HBASE-17973.002.patch, 
> HBASE-17973.003.patch
>
>
> The data locality of regions often plays a large role in the efficiency of 
> HBase. Compactions are also expensive to execute, especially on very large 
> tables. The balancer can do a good job trying to maintain locality (when 
> tuned properly), but it is not perfect.
> This creates a less-than-desirable situation where it's a costly operation to 
> take a cluster with spotty poor locality (e.g. a small percentage of 
> regionservers with poor locality).
> We already have this information available via the {{ClusterStatus}} proto. 
> We can easily write a shell command that can present regions which are 
> lacking a certain percentage of locality.





[jira] [Updated] (HBASE-17973) Create shell command to identify regions with poor locality

2017-05-05 Thread Karan Mehta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karan Mehta updated HBASE-17973:

Attachment: HBASE-17973.branch-1.001.patch

> Create shell command to identify regions with poor locality
> ---
>
> Key: HBASE-17973
> URL: https://issues.apache.org/jira/browse/HBASE-17973
> Project: HBase
>  Issue Type: Improvement
>  Components: shell
>Reporter: Josh Elser
>Assignee: Josh Elser
> Fix For: 2.0.0
>
> Attachments: HBASE-17973.001.patch, HBASE-17973.002.patch, 
> HBASE-17973.003.patch, HBASE-17973.branch-1.001.patch
>
>
> The data locality of regions often plays a large role in the efficiency of 
> HBase. Compactions are also expensive to execute, especially on very large 
> tables. The balancer can do a good job trying to maintain locality (when 
> tuned properly), but it is not perfect.
> This creates a less-than-desirable situation where it's a costly operation to 
> take a cluster with spotty poor locality (e.g. a small percentage of 
> regionservers with poor locality).
> We already have this information available via the {{ClusterStatus}} proto. 
> We can easily write a shell command that can present regions which are 
> lacking a certain percentage of locality.





[jira] [Commented] (HBASE-17973) Create shell command to identify regions with poor locality

2017-05-05 Thread Karan Mehta (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15999029#comment-15999029
 ] 

Karan Mehta commented on HBASE-17973:
-

[~apurtell] I have added a patch for branch-1. I have squashed all the patches 
and the addendum into a single patch. I hope it's fine that way. Please review 
whenever convenient.

> Create shell command to identify regions with poor locality
> ---
>
> Key: HBASE-17973
> URL: https://issues.apache.org/jira/browse/HBASE-17973
> Project: HBase
>  Issue Type: Improvement
>  Components: shell
>Reporter: Josh Elser
>Assignee: Josh Elser
> Fix For: 2.0.0
>
> Attachments: HBASE-17973.001.patch, HBASE-17973.002.patch, 
> HBASE-17973.003.patch, HBASE-17973.branch-1.001.patch
>
>
> The data locality of regions often plays a large role in the efficiency of 
> HBase. Compactions are also expensive to execute, especially on very large 
> tables. The balancer can do a good job trying to maintain locality (when 
> tuned properly), but it is not perfect.
> This creates a less-than-desirable situation where it's a costly operation to 
> take a cluster with spotty poor locality (e.g. a small percentage of 
> regionservers with poor locality).
> We already have this information available via the {{ClusterStatus}} proto. 
> We can easily write a shell command that can present regions which are 
> lacking a certain percentage of locality.





[jira] [Commented] (HBASE-14925) Develop HBase shell command/tool to list table's region info through command line

2017-05-05 Thread Karan Mehta (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15999042#comment-15999042
 ] 

Karan Mehta commented on HBASE-14925:
-

The addendum patch attached here will cleanly apply to {{branch-1}} after the 
{{branch-1}} patch for HBASE-17973 is committed.

> Develop HBase shell command/tool to list table's region info through command 
> line
> -
>
> Key: HBASE-14925
> URL: https://issues.apache.org/jira/browse/HBASE-14925
> Project: HBase
>  Issue Type: Improvement
>  Components: shell
>Reporter: Romil Choksi
>Assignee: Karan Mehta
> Fix For: 2.0.0, 1.4.0
>
> Attachments: HBASE-14925.002.patch, 
> HBASE-14925.003.addendum.001.patch, HBASE-14925.003.patch, HBASE-14925.patch
>
>
> I am going through the hbase shell commands to see if there is anything I can 
> use to get all the regions info just for a particular table. I don’t see any 
> such command that provides me that information.
> It would be better to have a command that provides region info, start key, 
> end key etc taking a table name as the input parameter. This is available 
> through HBase UI on clicking on a particular table's link
> A tool/shell command to get a list of regions for a table or all tables in a 
> tabular structured output (that is machine readable)





[jira] [Commented] (HBASE-17998) Improve HBase RPC write throttling size estimation

2017-05-08 Thread Karan Mehta (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16001881#comment-16001881
 ] 

Karan Mehta commented on HBASE-17998:
-

I did not find this really feasible, for the following reasons.
1. Calling {{getSerializedSize()}} on a protobuf to estimate its size requires 
traversing the complete protobuf data, which is what we want to avoid in the 
first place.
2. The size of Puts can be estimated from the number of bytes received for the 
RPC request. This information can be passed around with the 
{{HBaseRpcController}} class. However, it is difficult to estimate the size of 
only the {{Puts}} in a {{MultiRequest}}, since we just have a total buffer 
size, which may include {{Gets}} and {{Scans}}. I am not sure we should really 
be making code changes just to support this.

Rather than estimating the size arbitrarily, a slightly better approach might 
be to use a moving average based on past requests, as suggested by 
[~vincentpoon]. Let's discuss whether there are other ways to accomplish this.
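
To make the moving-average idea concrete, here is a minimal sketch in Ruby (chosen only for brevity; the real change would live in the Java server code). Everything in it is an assumption for illustration: the {{PutSizeEstimator}} class, its window size, and the fallback to the current 100-byte default are hypothetical and not part of HBase.

```ruby
# Hypothetical estimator: averages the sizes of the most recently seen puts.
class PutSizeEstimator
  DEFAULT_ESTIMATE = 100 # bytes; the hardcoded value mentioned in this issue

  def initialize(window = 4)
    @window = window # number of recent puts to average over
    @sizes = []
  end

  # Record the observed size (in bytes) of a completed put,
  # dropping the oldest observation once the window is full.
  def record(size)
    @sizes << size
    @sizes.shift if @sizes.length > @window
  end

  # Estimate for the next put: the mean of the window,
  # or the default when nothing has been observed yet.
  def estimate
    return DEFAULT_ESTIMATE if @sizes.empty?

    @sizes.inject(:+) / @sizes.length
  end
end

estimator = PutSizeEstimator.new(3)
[200, 400, 600, 800].each { |size| estimator.record(size) }
puts estimator.estimate
```

A fixed window keeps the per-request cost constant and lets the estimate follow changes in the workload, at the price of lagging behind sudden shifts in put size.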

> Improve HBase RPC write throttling size estimation
> --
>
> Key: HBASE-17998
> URL: https://issues.apache.org/jira/browse/HBASE-17998
> Project: HBase
>  Issue Type: Improvement
>Reporter: Karan Mehta
>Assignee: Karan Mehta
>
> Currently when RPC throttling, the size of each put is estimated using a 
> hardcoded value 100 bytes. This can be improved by using the protobuf size as 
> an estimate, without having to deserialize or do a big refactoring.





[jira] [Reopened] (HBASE-18000) Make sure we always return the scanner id with ScanResponse

2017-05-10 Thread Karan Mehta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karan Mehta reopened HBASE-18000:
-

> Make sure we always return the scanner id with ScanResponse
> ---
>
> Key: HBASE-18000
> URL: https://issues.apache.org/jira/browse/HBASE-18000
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 2.0.0, 1.4.0, 1.3.1
>Reporter: Lars Hofhansl
>Assignee: Duo Zhang
> Fix For: 2.0.0, 1.4.0, 1.3.2
>
> Attachments: HBASE-18000.patch
>
>
> Some external tooling (like OpenTSDB) relies on the scanner id to tie 
> asynchronous responses back to their requests.
> (see comments on HBASE-17489)





[jira] [Commented] (HBASE-18000) Make sure we always return the scanner id with ScanResponse

2017-05-10 Thread Karan Mehta (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16005665#comment-16005665
 ] 

Karan Mehta commented on HBASE-18000:
-

[~Apache9] 
This is a corner case. I feel that the bug is not fully resolved.

We have set the {{scannerId}} on the {{ScanResponse}} builder
{code}
  if (request.hasScannerId()) {
    rsh = getRegionScanner(request);
    isSmallScan = false;
    // The downstream projects such as AsyncHBase in OpenTSDB need this value. See HBASE-18000
    // for more details.
    builder.setScannerId(request.getScannerId());
  } else {
{code}

and {{getRegionScanner(request)}} looks like this:
{code}
  if (request.hasCloseScanner() && request.getCloseScanner()) {
    throw SCANNER_ALREADY_CLOSED;
  } else {
{code}

which implies that a {{CloseScannerRequest}} makes this line throw an 
exception, so {{builder.setScannerId(request.getScannerId())}} is never 
executed. The exception is then handled by returning an empty 
{{ScanResponse}}, as follows:
{code}
  if (e == SCANNER_ALREADY_CLOSED) {
    // Now we will close scanner automatically if there are no more results for this region but
    // the old client will still send a close request to us. Just ignore it and return.
    return builder.build();
  }
{code}
As a result, no {{scannerId}} is ever set on the {{builder}}.

A simple possible fix is to set the scanner id before looking up the scanner, 
as in the following diff. Please share your thoughts.
{code}
   if (request.hasScannerId()) {
-    rsh = getRegionScanner(request);
-    isSmallScan = false;
     // The downstream projects such as AsyncHBase in OpenTSDB need this value. See HBASE-18000
     // for more details.
     builder.setScannerId(request.getScannerId());
+    rsh = getRegionScanner(request);
+    isSmallScan = false;
   } else {
{code}
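
To illustrate why the ordering matters, here is a simplified, self-contained model of the control flow. {{ResponseBuilder}}, {{lookupScanner}} and both method names are hypothetical stand-ins rather than HBase classes; only the order of the two statements and the "ignore and return" handler are taken from the snippets above.

```java
// Hypothetical, simplified model of the control flow discussed above.
// ResponseBuilder and lookupScanner are stand-ins, not HBase classes.
final class ScannerIdOrdering {
  static final RuntimeException SCANNER_ALREADY_CLOSED =
      new RuntimeException("scanner already closed");

  static final class ResponseBuilder {
    Long scannerId; // null means the field was never set
  }

  // Stand-in for getRegionScanner(request): throws for a close request.
  static void lookupScanner(boolean closeRequested) {
    if (closeRequested) {
      throw SCANNER_ALREADY_CLOSED;
    }
  }

  // Original order: lookup first, so the id is lost when the lookup throws.
  static ResponseBuilder lookupThenSetId(long id, boolean closeRequested) {
    ResponseBuilder builder = new ResponseBuilder();
    try {
      lookupScanner(closeRequested);
      builder.scannerId = id;
    } catch (RuntimeException e) {
      if (e != SCANNER_ALREADY_CLOSED) {
        throw e;
      }
      // Ignored, as in the handler quoted above; the builder is returned as-is.
    }
    return builder;
  }

  // Proposed order: set the id first, so it survives the exception.
  static ResponseBuilder setIdThenLookup(long id, boolean closeRequested) {
    ResponseBuilder builder = new ResponseBuilder();
    try {
      builder.scannerId = id;
      lookupScanner(closeRequested);
    } catch (RuntimeException e) {
      if (e != SCANNER_ALREADY_CLOSED) {
        throw e;
      }
    }
    return builder;
  }
}
```

In this model, a close request leaves {{scannerId}} unset with the original order and set with the proposed order; non-close requests behave the same either way.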

> Make sure we always return the scanner id with ScanResponse
> ---
>
> Key: HBASE-18000
> URL: https://issues.apache.org/jira/browse/HBASE-18000
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 2.0.0, 1.4.0, 1.3.1
>Reporter: Lars Hofhansl
>Assignee: Duo Zhang
> Fix For: 2.0.0, 1.4.0, 1.3.2
>
> Attachments: HBASE-18000.patch
>
>
> Some external tooling (like OpenTSDB) relies on the scanner id to tie 
> asynchronous responses back to their requests.
> (see comments on HBASE-17489)





[jira] [Created] (HBASE-18042) Client Compatibility breaks between versions 1.2 and 1.3

2017-05-12 Thread Karan Mehta (JIRA)
Karan Mehta created HBASE-18042:
---

 Summary: Client Compatibility breaks between versions 1.2 and 1.3
 Key: HBASE-18042
 URL: https://issues.apache.org/jira/browse/HBASE-18042
 Project: HBase
  Issue Type: Bug
Reporter: Karan Mehta


OpenTSDB uses AsyncHBase as its client rather than the traditional HBase 
client. Between versions 1.2 and 1.3, the {{ClientProtos}} changed: new fields 
were added to the {{ScanResponse}} proto.

In 1.2, a typical Scan requires the caller to make an OpenScanner request, 
GetNextRows requests and a CloseScanner request, driven by the {{more_rows}} 
boolean field in the {{ScanResponse}} proto.

However, 1.3 added a new parameter, {{more_results_in_region}}, which limits 
the results per region. The client therefore now has to manage sending all the 
requests for each region. Furthermore, once the results from a particular 
region are exhausted, the {{ScanResponse}} sets {{more_results_in_region}} to 
false, while {{more_results}} can still be true. Whenever the former is set to 
false, the {{RegionScanner}} is also closed. 

OpenTSDB makes an OpenScanner request and receives all of its results in the 
first {{ScanResponse}}, which creates the condition described in the previous 
paragraph. Since {{more_rows}} is true, it proceeds to send the next request, 
at which point {{RSRpcServices}} throws {{UnknownScannerException}}. Protobuf 
wire compatibility is maintained, but the expected behavior has changed.
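
As a rough sketch of what a client would need to do with the two flags, the decision could look like the following. The class, the enum and the method name are hypothetical and are not taken from AsyncHBase or the HBase client; a {{null}} value for {{moreResultsInRegion}} is an assumption used here to model a server that does not set the field.

```java
// Hypothetical sketch of the client-side decision after each ScanResponse.
// None of these names come from AsyncHBase or the HBase client.
final class ScanNextStep {
  enum Action { NEXT_ON_SAME_SCANNER, OPEN_SCANNER_ON_NEXT_REGION, FINISH }

  // moreResultsInRegion == null models a server that did not set the field.
  static Action decide(boolean moreResults, Boolean moreResultsInRegion) {
    if (!moreResults) {
      return Action.FINISH;
    }
    if (moreResultsInRegion != null && !moreResultsInRegion) {
      // The server has already closed this region's scanner, so reusing
      // the scanner id would fail with UnknownScannerException.
      return Action.OPEN_SCANNER_ON_NEXT_REGION;
    }
    return Action.NEXT_ON_SAME_SCANNER;
  }
}
```

A client that looks only at the first flag always takes the last branch, which is the failure described above.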





[jira] [Commented] (HBASE-18042) Client Compatibility breaks between versions 1.2 and 1.3

2017-05-12 Thread Karan Mehta (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16008497#comment-16008497
 ] 

Karan Mehta commented on HBASE-18042:
-

A simple solution is to not close the scanner even when there are no more 
results in the region. I am not sure what other implications this has, 
though.
In {{RSRpcServices.java}}:
{code}
   addResults(builder, results, (PayloadCarryingRpcController) controller,
 RegionReplicaUtil.isDefaultReplica(region.getRegionInfo()));
-  if (!moreResults || !moreResultsInRegion || closeScanner) {
+  if (!moreResults || closeScanner) {
     scannerClosed = true;
     closeScanner(region, scanner, scannerName, context);
{code}
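
The effect of the change can be seen by comparing the two conditions in isolation. This is only a model of the boolean logic in the diff above; the class and method names are hypothetical.

```java
// Hypothetical stand-in for the close decision in RSRpcServices; only the
// boolean logic from the diff above is modeled here.
final class CloseCondition {
  // Condition before the proposed change.
  static boolean closesBefore(boolean moreResults, boolean moreResultsInRegion,
      boolean closeScanner) {
    return !moreResults || !moreResultsInRegion || closeScanner;
  }

  // Condition after the proposed change.
  static boolean closesAfter(boolean moreResults, boolean closeScanner) {
    return !moreResults || closeScanner;
  }
}
```

The two differ only when {{moreResults}} is true, {{moreResultsInRegion}} is false and the client did not ask for a close, which is exactly the case OpenTSDB hits.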

> Client Compatibility breaks between versions 1.2 and 1.3
> 
>
> Key: HBASE-18042
> URL: https://issues.apache.org/jira/browse/HBASE-18042
> Project: HBase
>  Issue Type: Bug
>Reporter: Karan Mehta
>
> OpenTSDB uses AsyncHBase as its client, rather than using the traditional 
> HBase Client. From version 1.2 to 1.3, the {{ClientProtos}} have been 
> changed. Newer fields are added to {{ScanResponse}} proto.
> For a typical Scan request in 1.2, would require caller to make an 
> OpenScanner Request, GetNextRows Request and a CloseScanner Request, based on 
> {{more_rows}} boolean field in the {{ScanResponse}} proto.
> However, from 1.3, new parameter {{more_results_in_region}} was added, which 
> limits the results per region. Therefore the client has to now manage sending 
> all the requests for each region. Further more, if the results are exhausted 
> from a particular region, the {{ScanResponse}} will set 
> {{more_results_in_region}} to false, but {{more_results}} can still be true. 
> Whenever the former is set to false, the {{RegionScanner}} will also be 
> closed. 
> OpenTSDB makes an OpenScanner Request and receives all its results in the 
> first {{ScanResponse}} itself, thus creating a condition as described in 
> above paragraph. Since {{more_rows}} is true, it will proceed to send next 
> request at which point the {{RSRpcServices}} will throw 
> {{UnknownScannerException}}. The protobuf client compatibility is maintained 
> but expected behavior is modified.





[jira] [Assigned] (HBASE-18042) Client Compatibility breaks between versions 1.2 and 1.3

2017-05-12 Thread Karan Mehta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karan Mehta reassigned HBASE-18042:
---

Assignee: Karan Mehta

> Client Compatibility breaks between versions 1.2 and 1.3
> 
>
> Key: HBASE-18042
> URL: https://issues.apache.org/jira/browse/HBASE-18042
> Project: HBase
>  Issue Type: Bug
>Reporter: Karan Mehta
>Assignee: Karan Mehta
>
> OpenTSDB uses AsyncHBase as its client, rather than using the traditional 
> HBase Client. From version 1.2 to 1.3, the {{ClientProtos}} have been 
> changed. Newer fields are added to {{ScanResponse}} proto.
> For a typical Scan request in 1.2, would require caller to make an 
> OpenScanner Request, GetNextRows Request and a CloseScanner Request, based on 
> {{more_rows}} boolean field in the {{ScanResponse}} proto.
> However, from 1.3, new parameter {{more_results_in_region}} was added, which 
> limits the results per region. Therefore the client has to now manage sending 
> all the requests for each region. Further more, if the results are exhausted 
> from a particular region, the {{ScanResponse}} will set 
> {{more_results_in_region}} to false, but {{more_results}} can still be true. 
> Whenever the former is set to false, the {{RegionScanner}} will also be 
> closed. 
> OpenTSDB makes an OpenScanner Request and receives all its results in the 
> first {{ScanResponse}} itself, thus creating a condition as described in 
> above paragraph. Since {{more_rows}} is true, it will proceed to send next 
> request at which point the {{RSRpcServices}} will throw 
> {{UnknownScannerException}}. The protobuf client compatibility is maintained 
> but expected behavior is modified.





[jira] [Updated] (HBASE-18042) Client Compatibility breaks between versions 1.2 and 1.3

2017-05-12 Thread Karan Mehta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karan Mehta updated HBASE-18042:

Affects Version/s: 1.3.1

> Client Compatibility breaks between versions 1.2 and 1.3
> 
>
> Key: HBASE-18042
> URL: https://issues.apache.org/jira/browse/HBASE-18042
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.1
>Reporter: Karan Mehta
>Assignee: Karan Mehta
>
> OpenTSDB uses AsyncHBase as its client, rather than using the traditional 
> HBase Client. From version 1.2 to 1.3, the {{ClientProtos}} have been 
> changed. Newer fields are added to {{ScanResponse}} proto.
> For a typical Scan request in 1.2, would require caller to make an 
> OpenScanner Request, GetNextRows Request and a CloseScanner Request, based on 
> {{more_rows}} boolean field in the {{ScanResponse}} proto.
> However, from 1.3, new parameter {{more_results_in_region}} was added, which 
> limits the results per region. Therefore the client has to now manage sending 
> all the requests for each region. Further more, if the results are exhausted 
> from a particular region, the {{ScanResponse}} will set 
> {{more_results_in_region}} to false, but {{more_results}} can still be true. 
> Whenever the former is set to false, the {{RegionScanner}} will also be 
> closed. 
> OpenTSDB makes an OpenScanner Request and receives all its results in the 
> first {{ScanResponse}} itself, thus creating a condition as described in 
> above paragraph. Since {{more_rows}} is true, it will proceed to send next 
> request at which point the {{RSRpcServices}} will throw 
> {{UnknownScannerException}}. The protobuf client compatibility is maintained 
> but expected behavior is modified.





[jira] [Commented] (HBASE-18042) Client Compatibility breaks between versions 1.2 and 1.3

2017-05-13 Thread Karan Mehta (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16009182#comment-16009182
 ] 

Karan Mehta commented on HBASE-18042:
-

HBASE-17489

> Client Compatibility breaks between versions 1.2 and 1.3
> 
>
> Key: HBASE-18042
> URL: https://issues.apache.org/jira/browse/HBASE-18042
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.1
>Reporter: Karan Mehta
>Assignee: Karan Mehta
>
> OpenTSDB uses AsyncHBase as its client, rather than using the traditional 
> HBase Client. From version 1.2 to 1.3, the {{ClientProtos}} have been 
> changed. Newer fields are added to {{ScanResponse}} proto.
> For a typical Scan request in 1.2, would require caller to make an 
> OpenScanner Request, GetNextRows Request and a CloseScanner Request, based on 
> {{more_rows}} boolean field in the {{ScanResponse}} proto.
> However, from 1.3, new parameter {{more_results_in_region}} was added, which 
> limits the results per region. Therefore the client has to now manage sending 
> all the requests for each region. Further more, if the results are exhausted 
> from a particular region, the {{ScanResponse}} will set 
> {{more_results_in_region}} to false, but {{more_results}} can still be true. 
> Whenever the former is set to false, the {{RegionScanner}} will also be 
> closed. 
> OpenTSDB makes an OpenScanner Request and receives all its results in the 
> first {{ScanResponse}} itself, thus creating a condition as described in 
> above paragraph. Since {{more_rows}} is true, it will proceed to send next 
> request at which point the {{RSRpcServices}} will throw 
> {{UnknownScannerException}}. The protobuf client compatibility is maintained 
> but expected behavior is modified.





[jira] [Commented] (HBASE-18042) Client Compatibility breaks between versions 1.2 and 1.3

2017-05-15 Thread Karan Mehta (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16011044#comment-16011044
 ] 

Karan Mehta commented on HBASE-18042:
-

bq. We already have this field for branch-1.2.
I missed that. Sorry about it.

But the behavior of {{closeScanner}} changed between 1.2 and 1.3. In 1.2, 
although we set the {{more_results_in_region}} bit, we never close the scanner 
based on it. I am not sure what the expected behavior should be in such a case.

bq. IIRC, the problem is that the official hbase client implementation does not 
handle {{more_results_in_region}} correctly which leads to one more request to 
RS but get nothing. I think this is a bug?

Can you provide some more insight into this? I am not fully aware of it. 
From what I understand, in 1.3, if an OpenScanRequest for a region returns all 
the results, then setting the {{more_results_in_region}} bit should save the 
client one RPC request for CloseScanner, since the scanner is closed 
automatically on the server side. We do not get this advantage in 1.2. If an 
external client reads the {{more_results_in_region}} bit and doesn't send out 
the CloseScanRequest, then an open scanner will be left lying around on the 
server side, wasting resources. 

> Client Compatibility breaks between versions 1.2 and 1.3
> 
>
> Key: HBASE-18042
> URL: https://issues.apache.org/jira/browse/HBASE-18042
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.1
>Reporter: Karan Mehta
>Assignee: Karan Mehta
>
> OpenTSDB uses AsyncHBase as its client, rather than using the traditional 
> HBase Client. From version 1.2 to 1.3, the {{ClientProtos}} have been 
> changed. Newer fields are added to {{ScanResponse}} proto.
> For a typical Scan request in 1.2, would require caller to make an 
> OpenScanner Request, GetNextRows Request and a CloseScanner Request, based on 
> {{more_rows}} boolean field in the {{ScanResponse}} proto.
> However, from 1.3, new parameter {{more_results_in_region}} was added, which 
> limits the results per region. Therefore the client has to now manage sending 
> all the requests for each region. Further more, if the results are exhausted 
> from a particular region, the {{ScanResponse}} will set 
> {{more_results_in_region}} to false, but {{more_results}} can still be true. 
> Whenever the former is set to false, the {{RegionScanner}} will also be 
> closed. 
> OpenTSDB makes an OpenScanner Request and receives all its results in the 
> first {{ScanResponse}} itself, thus creating a condition as described in 
> above paragraph. Since {{more_rows}} is true, it will proceed to send next 
> request at which point the {{RSRpcServices}} will throw 
> {{UnknownScannerException}}. The protobuf client compatibility is maintained 
> but expected behavior is modified.





[jira] [Commented] (HBASE-18042) Client Compatibility breaks between versions 1.2 and 1.3

2017-05-15 Thread Karan Mehta (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16011147#comment-16011147
 ] 

Karan Mehta commented on HBASE-18042:
-

bq. So, more_rows was used to indicate more results in the region or more 
results overall in 1.2?
{{more_rows}} has always been used to indicate if there are any results pending 
overall.

bq. It may just be a bug in opentsdb itself to check more_results_in_region as 
well as more_rows.

I understand that OpenTSDB should handle {{more_results_in_region}}, but can we 
expect such a client-side behavior change between two minor releases? OpenTSDB 
works fine with 1.2 but not with 1.3.
[~apurtell]

> Client Compatibility breaks between versions 1.2 and 1.3
> 
>
> Key: HBASE-18042
> URL: https://issues.apache.org/jira/browse/HBASE-18042
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.1
>Reporter: Karan Mehta
>Assignee: Karan Mehta
>
> OpenTSDB uses AsyncHBase as its client, rather than using the traditional 
> HBase Client. From version 1.2 to 1.3, the {{ClientProtos}} have been 
> changed. Newer fields are added to {{ScanResponse}} proto.
> For a typical Scan request in 1.2, would require caller to make an 
> OpenScanner Request, GetNextRows Request and a CloseScanner Request, based on 
> {{more_rows}} boolean field in the {{ScanResponse}} proto.
> However, from 1.3, new parameter {{more_results_in_region}} was added, which 
> limits the results per region. Therefore the client has to now manage sending 
> all the requests for each region. Further more, if the results are exhausted 
> from a particular region, the {{ScanResponse}} will set 
> {{more_results_in_region}} to false, but {{more_results}} can still be true. 
> Whenever the former is set to false, the {{RegionScanner}} will also be 
> closed. 
> OpenTSDB makes an OpenScanner Request and receives all its results in the 
> first {{ScanResponse}} itself, thus creating a condition as described in 
> above paragraph. Since {{more_rows}} is true, it will proceed to send next 
> request at which point the {{RSRpcServices}} will throw 
> {{UnknownScannerException}}. The protobuf client compatibility is maintained 
> but expected behavior is modified.





[jira] [Commented] (HBASE-18042) Client Compatibility breaks between versions 1.2 and 1.3

2017-05-16 Thread Karan Mehta (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16012609#comment-16012609
 ] 

Karan Mehta commented on HBASE-18042:
-

bq. But it is ok to get any exception while closing a scanner because the 
ClientScanner will catch any exception from server in close().
That would generate a lot of unnecessary exceptions and route them to the 
client, which is not good. OpenTSDB makes a lot of RPC calls to fetch metrics 
data, and it sometimes ends up stuck in an endless loop of these exceptions. 
A single query might work fine. Sometimes it returns duplicate results, since 
it sends out multiple OpenScannerRequests after being confused by the 
exception. 

bq. I guess the problem for AsyncHBase is that it will also fetch data when 
opening a scanner, just like what we do in the new code in master and branch-1?
Yes, that is true. See the description for the exact behavior.

> Client Compatibility breaks between versions 1.2 and 1.3
> 
>
> Key: HBASE-18042
> URL: https://issues.apache.org/jira/browse/HBASE-18042
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.3.1
>Reporter: Karan Mehta
>Assignee: Karan Mehta
>
> OpenTSDB uses AsyncHBase as its client, rather than using the traditional 
> HBase Client. From version 1.2 to 1.3, the {{ClientProtos}} have been 
> changed. Newer fields are added to {{ScanResponse}} proto.
> For a typical Scan request in 1.2, would require caller to make an 
> OpenScanner Request, GetNextRows Request and a CloseScanner Request, based on 
> {{more_rows}} boolean field in the {{ScanResponse}} proto.
> However, from 1.3, new parameter {{more_results_in_region}} was added, which 
> limits the results per region. Therefore the client has to now manage sending 
> all the requests for each region. Further more, if the results are exhausted 
> from a particular region, the {{ScanResponse}} will set 
> {{more_results_in_region}} to false, but {{more_results}} can still be true. 
> Whenever the former is set to false, the {{RegionScanner}} will also be 
> closed. 
> OpenTSDB makes an OpenScanner Request and receives all its results in the 
> first {{ScanResponse}} itself, thus creating a condition as described in 
> above paragraph. Since {{more_rows}} is true, it will proceed to send next 
> request at which point the {{RSRpcServices}} will throw 
> {{UnknownScannerException}}. The protobuf client compatibility is maintained 
> but expected behavior is modified.





[jira] [Commented] (HBASE-11544) [Ergonomics] hbase.client.scanner.caching is dogged and will try to return batch even if it means OOME

2017-05-22 Thread Karan Mehta (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16020554#comment-16020554
 ] 

Karan Mehta commented on HBASE-11544:
-

[~jonathan.lawlor] 
Can the server return multiple partial rows?
If yes, why does the client-side code in {{ClientScanner}} assume that only the 
last result is partial?
If not, why do we have a repeated boolean value for {{partial_flag_per_result}} 
in {{Client.protos}}?

> [Ergonomics] hbase.client.scanner.caching is dogged and will try to return 
> batch even if it means OOME
> --
>
> Key: HBASE-11544
> URL: https://issues.apache.org/jira/browse/HBASE-11544
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: Jonathan Lawlor
>Priority: Critical
> Fix For: 2.0.0, 1.1.0
>
> Attachments: Allocation_Hot_Spots.html, gc.j.png, 
> HBASE-11544-addendum-v1.patch, HBASE-11544-addendum-v2.patch, 
> HBASE-11544-branch_1_0-v1.patch, HBASE-11544-branch_1_0-v2.patch, 
> HBASE-11544-v1.patch, HBASE-11544-v2.patch, HBASE-11544-v3.patch, 
> HBASE-11544-v4.patch, HBASE-11544-v5.patch, HBASE-11544-v6.patch, 
> HBASE-11544-v6.patch, HBASE-11544-v6.patch, HBASE-11544-v7.patch, 
> HBASE-11544-v8-branch-1.patch, HBASE-11544-v8.patch, hits.j.png, h.png, 
> mean.png, m.png, net.j.png, q (2).png
>
>
> Running some tests, I set hbase.client.scanner.caching=1000.  Dataset has 
> large cells.  I kept OOME'ing.
> Serverside, we should measure how much we've accumulated and return to the 
> client whatever we've gathered once we pass out a certain size threshold 
> rather than keep accumulating till we OOME.





[jira] [Comment Edited] (HBASE-11544) [Ergonomics] hbase.client.scanner.caching is dogged and will try to return batch even if it means OOME

2017-05-23 Thread Karan Mehta (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16020554#comment-16020554
 ] 

Karan Mehta edited comment on HBASE-11544 at 5/23/17 8:42 PM:
--

[~jonathan.lawlor]  [~stack]
Can the server return multiple partial rows?
If yes, why does the client-side code in {{ClientScanner}} assume that only the 
last result is partial?
If not, why do we have a repeated boolean value for {{partial_flag_per_result}} 
in {{Client.protos}}?


was (Author: karanmehta93):
[~jonathan.lawlor] 
Can the server return multiple partial rows?
If yes, why does client side code assume that only the last result is partial 
in {{ClientScanner}} ?
If not, why do we have a repeated boolean value for {{partial_flag_per_result}} 
in {{Client.protos}} ?

> [Ergonomics] hbase.client.scanner.caching is dogged and will try to return 
> batch even if it means OOME
> --
>
> Key: HBASE-11544
> URL: https://issues.apache.org/jira/browse/HBASE-11544
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: Jonathan Lawlor
>Priority: Critical
> Fix For: 2.0.0, 1.1.0
>
> Attachments: Allocation_Hot_Spots.html, gc.j.png, 
> HBASE-11544-addendum-v1.patch, HBASE-11544-addendum-v2.patch, 
> HBASE-11544-branch_1_0-v1.patch, HBASE-11544-branch_1_0-v2.patch, 
> HBASE-11544-v1.patch, HBASE-11544-v2.patch, HBASE-11544-v3.patch, 
> HBASE-11544-v4.patch, HBASE-11544-v5.patch, HBASE-11544-v6.patch, 
> HBASE-11544-v6.patch, HBASE-11544-v6.patch, HBASE-11544-v7.patch, 
> HBASE-11544-v8-branch-1.patch, HBASE-11544-v8.patch, hits.j.png, h.png, 
> mean.png, m.png, net.j.png, q (2).png
>
>
> Running some tests, I set hbase.client.scanner.caching=1000.  Dataset has 
> large cells.  I kept OOME'ing.
> Serverside, we should measure how much we've accumulated and return to the 
> client whatever we've gathered once we pass out a certain size threshold 
> rather than keep accumulating till we OOME.





[jira] [Created] (HBASE-18097) Client can save 1 RPC call for CloseScannerRequest

2017-05-23 Thread Karan Mehta (JIRA)
Karan Mehta created HBASE-18097:
---

 Summary: Client can save 1 RPC call for CloseScannerRequest
 Key: HBASE-18097
 URL: https://issues.apache.org/jira/browse/HBASE-18097
 Project: HBase
  Issue Type: Improvement
Reporter: Karan Mehta


Starting with version 1.3, HBase automatically closes the scanner on the 
server side whenever the results are exhausted, and the corresponding bits are 
set in the {{ScanResponse}} proto returned to the client. We can use that 
information to eliminate the CloseScannerRequest RPC call, saving 1 RPC per 
region per scan. This can be particularly useful for tables with many regions.

Also, the {{ScanResponse}} proto currently sends 1 bit per {{Result}} that it 
embeds in the {{CellScanner}} to indicate whether that result is partial. 
{code}
// In every RPC response there should be at most a single partial result. Furthermore, if
// there is a partial result, it is guaranteed to be in the last position of the array.
{code}
According to the client, only the last result can be partial, so this repeated 
bool can be converted to a single bool, reducing the overhead of serializing 
and deserializing the array. This would break wire compatibility, so it is 
something to consider for upcoming versions.
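
A small sketch of why a single flag would carry the same information, assuming the invariant quoted above always holds. The class and method names are hypothetical; this does not touch the actual proto definitions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: under the "only the last result may be partial"
// invariant, the per-result flags and a single flag are interchangeable.
final class PartialFlags {
  // Collapse per-result flags into one; reject input that breaks the invariant.
  static boolean collapse(List<Boolean> perResult) {
    for (int i = 0; i < perResult.size() - 1; i++) {
      if (perResult.get(i)) {
        throw new IllegalArgumentException("only the last result may be partial");
      }
    }
    return !perResult.isEmpty() && perResult.get(perResult.size() - 1);
  }

  // Rebuild the per-result flags from the result count and the single flag.
  static List<Boolean> expand(int resultCount, boolean lastIsPartial) {
    List<Boolean> flags = new ArrayList<>(resultCount);
    for (int i = 0; i < resultCount; i++) {
      flags.add(lastIsPartial && i == resultCount - 1);
    }
    return flags;
  }
}
```

If the server can in fact return more than one partial result per response, {{collapse}} fails and the repeated field has to stay.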





[jira] [Commented] (HBASE-18042) Client Compatibility breaks between versions 1.2 and 1.3

2017-05-24 Thread Karan Mehta (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16023226#comment-16023226
 ] 

Karan Mehta commented on HBASE-18042:
-

[~Apache9]
Could you please clarify your explanation of the bug, along with the versions 
you are referring to as {{old client}}?
Thanks

> Client Compatibility breaks between versions 1.2 and 1.3
> 
>
> Key: HBASE-18042
> URL: https://issues.apache.org/jira/browse/HBASE-18042
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver, scan
>Affects Versions: 2.0.0, 1.4.0, 1.3.1
>Reporter: Karan Mehta
>Assignee: Duo Zhang
>Priority: Critical
> Fix For: 2.0.0, 1.4.0, 1.3.2
>
> Attachments: HBASE-18042-branch-1.patch, HBASE-18042-branch-1.patch, 
> HBASE-18042.patch, HBASE-18042-v1.patch
>
>
> OpenTSDB uses AsyncHBase as its client, rather than using the traditional 
> HBase Client. From version 1.2 to 1.3, the {{ClientProtos}} have been 
> changed. Newer fields are added to {{ScanResponse}} proto.
> For a typical Scan request in 1.2, would require caller to make an 
> OpenScanner Request, GetNextRows Request and a CloseScanner Request, based on 
> {{more_rows}} boolean field in the {{ScanResponse}} proto.
> However, from 1.3, new parameter {{more_results_in_region}} was added, which 
> limits the results per region. Therefore the client has to now manage sending 
> all the requests for each region. Further more, if the results are exhausted 
> from a particular region, the {{ScanResponse}} will set 
> {{more_results_in_region}} to false, but {{more_results}} can still be true. 
> Whenever the former is set to false, the {{RegionScanner}} will also be 
> closed. 
> OpenTSDB makes an OpenScanner Request and receives all its results in the 
> first {{ScanResponse}} itself, thus creating a condition as described in 
> above paragraph. Since {{more_rows}} is true, it will proceed to send next 
> request at which point the {{RSRpcServices}} will throw 
> {{UnknownScannerException}}. The protobuf client compatibility is maintained 
> but expected behavior is modified.





[jira] [Commented] (HBASE-18042) Client Compatibility breaks between versions 1.2 and 1.3

2017-05-25 Thread Karan Mehta (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16024324#comment-16024324
 ] 

Karan Mehta commented on HBASE-18042:
-

[~Apache9] Thanks for clarifying. Let me dig more into the code to understand 
it completely.

I have also put up a patch for AsyncHBase after upgrading the protos from 0.98 
to 1.3.
Link: https://github.com/OpenTSDB/opentsdb/pull/990
With the patch it should work fine, but in general it is good to keep the logic 
consistent.


> Client Compatibility breaks between versions 1.2 and 1.3
> 
>
> Key: HBASE-18042
> URL: https://issues.apache.org/jira/browse/HBASE-18042
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver, scan
>Affects Versions: 2.0.0, 1.4.0, 1.3.1
>Reporter: Karan Mehta
>Assignee: Duo Zhang
>Priority: Critical
> Fix For: 2.0.0, 1.4.0, 1.3.2
>
> Attachments: HBASE-18042-branch-1.patch, HBASE-18042-branch-1.patch, 
> HBASE-18042.patch, HBASE-18042-v1.patch, HBASE-18042-v2.patch
>
>
> OpenTSDB uses AsyncHBase as its client, rather than using the traditional 
> HBase Client. From version 1.2 to 1.3, the {{ClientProtos}} have been 
> changed. Newer fields are added to {{ScanResponse}} proto.
> For a typical Scan request in 1.2, would require caller to make an 
> OpenScanner Request, GetNextRows Request and a CloseScanner Request, based on 
> {{more_rows}} boolean field in the {{ScanResponse}} proto.
> However, from 1.3, new parameter {{more_results_in_region}} was added, which 
> limits the results per region. Therefore the client has to now manage sending 
> all the requests for each region. Further more, if the results are exhausted 
> from a particular region, the {{ScanResponse}} will set 
> {{more_results_in_region}} to false, but {{more_results}} can still be true. 
> Whenever the former is set to false, the {{RegionScanner}} will also be 
> closed. 
> OpenTSDB makes an OpenScanner Request and receives all its results in the 
> first {{ScanResponse}} itself, thus creating a condition as described in 
> above paragraph. Since {{more_rows}} is true, it will proceed to send next 
> request at which point the {{RSRpcServices}} will throw 
> {{UnknownScannerException}}. The protobuf client compatibility is maintained 
> but expected behavior is modified.





[jira] [Created] (HBASE-18117) Increase resiliency by allowing more parameters for online config change

2017-05-25 Thread Karan Mehta (JIRA)
Karan Mehta created HBASE-18117:
---

 Summary: Increase resiliency by allowing more parameters for 
online config change
 Key: HBASE-18117
 URL: https://issues.apache.org/jira/browse/HBASE-18117
 Project: HBase
  Issue Type: Improvement
Reporter: Karan Mehta


HBASE-8544 adds the ability to change config online without a server restart. 
This JIRA is to make more parameters use that feature.

As [~apurtell] suggested, the following are useful and frequently changed 
parameters in production:

- RPC limits, timeouts, and other performance relevant settings
- Replication limits and batch sizes
- Region carrying limit
- WAL retention and cleaning parameters

I will try to make the RPC timeout parameter online as a part of this JIRA. If 
it seems suitable then we can extend it to other params.
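
As a rough illustration of the pattern, a component holding the RPC timeout could re-read it when notified. The sketch below is dependency-free: {{ConfigObserver}} is a stand-in for HBase's {{ConfigurationObserver}} (which receives a Hadoop {{Configuration}} rather than a {{Map}}), and {{RpcTimeoutHolder}} and the 60000 ms fallback are assumptions for illustration.

```java
import java.util.Map;

// Stand-in for the observer interface from HBASE-8544. The real interface
// receives a Hadoop Configuration; a Map keeps this sketch dependency-free.
interface ConfigObserver {
  void onConfigurationChange(Map<String, String> conf);
}

// Hypothetical holder that re-reads the RPC timeout on every notification.
final class RpcTimeoutHolder implements ConfigObserver {
  static final String KEY = "hbase.rpc.timeout";
  static final int DEFAULT_MS = 60000; // assumed fallback for this sketch

  // volatile so request threads see the new value without locking
  private volatile int timeoutMs = DEFAULT_MS;

  @Override
  public void onConfigurationChange(Map<String, String> conf) {
    String raw = conf.get(KEY);
    timeoutMs = (raw == null) ? DEFAULT_MS : Integer.parseInt(raw);
  }

  int timeoutMs() {
    return timeoutMs;
  }
}
```

Code that currently caches the timeout in a final field would need to read it through such a holder for the online change to take effect.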





[jira] [Assigned] (HBASE-18117) Increase resiliency by allowing more parameters for online config change

2017-05-25 Thread Karan Mehta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karan Mehta reassigned HBASE-18117:
---

Assignee: Karan Mehta

> Increase resiliency by allowing more parameters for online config change
> 
>
> Key: HBASE-18117
> URL: https://issues.apache.org/jira/browse/HBASE-18117
> Project: HBase
>  Issue Type: Improvement
>Reporter: Karan Mehta
>Assignee: Karan Mehta
>
> HBASE-8544 adds the feature to change config online without having a server 
> restart. This JIRA is to work on new parameters for the utilizing that 
> feature.
> As [~apurtell] suggested, following are the useful and frequently changing 
> parameters in production.
> - RPC limits, timeouts, and other performance relevant settings
> - Replication limits and batch sizes
> - Region carrying limit
> - WAL retention and cleaning parameters
> I will try to make the RPC timeout parameter online as a part of this JIRA. 
> If it seems suitable then we can extend it to other params.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-18117) Increase resiliency by allowing more parameters for online config change

2017-05-25 Thread Karan Mehta (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16025640#comment-16025640
 ] 

Karan Mehta commented on HBASE-18117:
-

The current framework only supports online changes to config parameters that 
are read on the server side. If any class in {{hbase-client}} implemented 
{{ConfigurationObserver}}, it would introduce a cyclic dependency between the 
{{hbase-client}} and {{hbase-server}} modules, and the build would fail. 

> Increase resiliency by allowing more parameters for online config change
> 
>
> Key: HBASE-18117
> URL: https://issues.apache.org/jira/browse/HBASE-18117
> Project: HBase
>  Issue Type: Improvement
>Reporter: Karan Mehta
>Assignee: Karan Mehta
>
> HBASE-8544 adds the feature to change config online without having a server 
> restart. This JIRA is to work on new parameters for the utilizing that 
> feature.
> As [~apurtell] suggested, following are the useful and frequently changing 
> parameters in production.
> - RPC limits, timeouts, and other performance relevant settings
> - Replication limits and batch sizes
> - Region carrying limit
> - WAL retention and cleaning parameters
> I will try to make the RPC timeout parameter online as a part of this JIRA. 
> If it seems suitable then we can extend it to other params.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-18117) Increase resiliency by allowing more parameters for online config change

2017-05-26 Thread Karan Mehta (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16026518#comment-16026518
 ] 

Karan Mehta commented on HBASE-18117:
-

{{ConfigurationManager}} manages all the observers and is meant to be a 
singleton, which is initialized inside {{RSRpcServices}}. However, it is 
declared package-private, so it is difficult to use for parameters that are 
read by classes in other packages. A better approach may be to move this 
framework from {{hbase-server}} to {{hbase-common}}. How does this approach 
seem? The framework can also follow the singleton design pattern if required.
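
To make the proposal concrete, a manager living in a shared module and exposed as a singleton could look roughly like the sketch below. The names {{SharedConfigManager}} and {{ReloadListener}} are hypothetical and are not the existing HBase classes; a {{Map}} again stands in for {{Configuration}}. Listeners are held weakly here as an assumption, so that registering does not keep an otherwise unused component alive.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.WeakHashMap;

// Hypothetical listener type; any module depending on the shared module could implement it.
interface ReloadListener {
  void onReload(Map<String, String> conf);
}

// Hypothetical singleton manager, visible to classes in any package.
final class SharedConfigManager {
  private static final SharedConfigManager INSTANCE = new SharedConfigManager();

  // Weak set: a listener that is no longer referenced elsewhere can be collected.
  private final Set<ReloadListener> listeners =
      Collections.newSetFromMap(new WeakHashMap<ReloadListener, Boolean>());

  private SharedConfigManager() {
  }

  static SharedConfigManager getInstance() {
    return INSTANCE;
  }

  void register(ReloadListener listener) {
    synchronized (listeners) {
      listeners.add(listener);
    }
  }

  void deregister(ReloadListener listener) {
    synchronized (listeners) {
      listeners.remove(listener);
    }
  }

  // Notifies every live listener and returns how many were notified.
  int notifyListeners(Map<String, String> conf) {
    List<ReloadListener> snapshot;
    synchronized (listeners) {
      snapshot = new ArrayList<>(listeners);
    }
    // Call listeners outside the lock so a slow listener cannot block registration.
    for (ReloadListener listener : snapshot) {
      listener.onReload(conf);
    }
    return snapshot.size();
  }
}
```

With this shape, a client-side class could register itself without referring to any server-side type, which is the point of moving the framework to a common module.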

> Increase resiliency by allowing more parameters for online config change
> 
>
> Key: HBASE-18117
> URL: https://issues.apache.org/jira/browse/HBASE-18117
> Project: HBase
>  Issue Type: Improvement
>Reporter: Karan Mehta
>Assignee: Karan Mehta
>
> HBASE-8544 adds the feature to change config online without having a server 
> restart. This JIRA is to work on new parameters for the utilizing that 
> feature.
> As [~apurtell] suggested, following are the useful and frequently changing 
> parameters in production.
> - RPC limits, timeouts, and other performance relevant settings
> - Replication limits and batch sizes
> - Region carrying limit
> - WAL retention and cleaning parameters
> I will try to make the RPC timeout parameter online as a part of this JIRA. 
> If it seems suitable then we can extend it to other params.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-18117) Increase resiliency by allowing more parameters for online config change

2017-05-26 Thread Karan Mehta (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16026864#comment-16026864
 ] 

Karan Mehta commented on HBASE-18117:
-

Another potential issue is ensuring that future uses of an online parameter 
actually implement {{ConfigurationObserver}}. I couldn't find any such 
enforcement in the current framework. [~gaurav.menghani], could you please 
confirm?

> Increase resiliency by allowing more parameters for online config change
> 
>
> Key: HBASE-18117
> URL: https://issues.apache.org/jira/browse/HBASE-18117
> Project: HBase
>  Issue Type: Improvement
>Reporter: Karan Mehta
>Assignee: Karan Mehta
>
> HBASE-8544 adds the feature to change config online without having a server 
> restart. This JIRA is to work on new parameters for the utilizing that 
> feature.
> As [~apurtell] suggested, following are the useful and frequently changing 
> parameters in production.
> - RPC limits, timeouts, and other performance relevant settings
> - Replication limits and batch sizes
> - Region carrying limit
> - WAL retention and cleaning parameters
> I will try to make the RPC timeout parameter online as a part of this JIRA. 
> If it seems suitable then we can extend it to other params.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-18097) Client can save 1 RPC call for CloseScannerRequest

2017-05-29 Thread Karan Mehta (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16028663#comment-16028663
 ] 

Karan Mehta commented on HBASE-18097:
-

The first part, saving 1 RPC request, is already implemented as part of 
HBASE-17508, where the scannerId is set to -1 whenever no results are left in 
the region. 
For the second part, related to the ScanResponse proto, we can still save some 
data in each RPC call. 
[~Apache9] Do you feel it is reasonable to do so from the next version?

> Client can save 1 RPC call for CloseScannerRequest
> --
>
> Key: HBASE-18097
> URL: https://issues.apache.org/jira/browse/HBASE-18097
> Project: HBase
>  Issue Type: Improvement
>Reporter: Karan Mehta
>
> Starting version 1.3, HBase automatically closes scanner on server side 
> whenever the results are exhausted and corresponding bits are set in the 
> {{ScanResponse}} proto returned to the client. We can use that info to 
> eliminate the closeScanRequest RPC call, thereby saving 1 RPC per region per 
> scan. This can be particularly useful for tables with more regions.
> Also, currently the {{ScanResponse}} proto sends out 1 bit per {{Result}} 
> that it has embeds inside the {{CellScanner}} to indicate if it is partial or 
> not. 
> {code}
> // In every RPC response there should be at most a single partial result. 
> Furthermore, if
> // there is a partial result, it is guaranteed to be in the last position 
> of the array.
> {code}
> According to client, only the last result can be partial, thus this repeated 
> bool can be converted to a bool, thus reducing overhead of serialization and 
> deserialization of the array. This will break wire compatibility therefore 
> this is something to look for in upcoming versions.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-18097) Client can save 1 RPC call for CloseScannerRequest

2017-05-30 Thread Karan Mehta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karan Mehta updated HBASE-18097:

Affects Version/s: 1.3.2

> Client can save 1 RPC call for CloseScannerRequest
> --
>
> Key: HBASE-18097
> URL: https://issues.apache.org/jira/browse/HBASE-18097
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 1.3.2
>Reporter: Karan Mehta
>
> Starting version 1.3, HBase automatically closes scanner on server side 
> whenever the results are exhausted and corresponding bits are set in the 
> {{ScanResponse}} proto returned to the client. We can use that info to 
> eliminate the closeScanRequest RPC call, thereby saving 1 RPC per region per 
> scan. This can be particularly useful for tables with more regions.
> Also, currently the {{ScanResponse}} proto sends out 1 bit per {{Result}} 
> that it has embeds inside the {{CellScanner}} to indicate if it is partial or 
> not. 
> {code}
> // In every RPC response there should be at most a single partial result. 
> Furthermore, if
> // there is a partial result, it is guaranteed to be in the last position 
> of the array.
> {code}
> According to client, only the last result can be partial, thus this repeated 
> bool can be converted to a bool, thus reducing overhead of serialization and 
> deserialization of the array. This will break wire compatibility therefore 
> this is something to look for in upcoming versions.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-18097) Save bandwidth on partial_flag_per_result in ScanResponse proto

2017-05-31 Thread Karan Mehta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karan Mehta updated HBASE-18097:

Summary: Save bandwidth on partial_flag_per_result in ScanResponse proto  
(was: Client can save 1 RPC call for CloseScannerRequest)

> Save bandwidth on partial_flag_per_result in ScanResponse proto
> ---
>
> Key: HBASE-18097
> URL: https://issues.apache.org/jira/browse/HBASE-18097
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 1.3.2
>Reporter: Karan Mehta
>
> Starting version 1.3, HBase automatically closes scanner on server side 
> whenever the results are exhausted and corresponding bits are set in the 
> {{ScanResponse}} proto returned to the client. We can use that info to 
> eliminate the closeScanRequest RPC call, thereby saving 1 RPC per region per 
> scan. This can be particularly useful for tables with more regions.
> Also, currently the {{ScanResponse}} proto sends out 1 bit per {{Result}} 
> that it has embeds inside the {{CellScanner}} to indicate if it is partial or 
> not. 
> {code}
> // In every RPC response there should be at most a single partial result. 
> Furthermore, if
> // there is a partial result, it is guaranteed to be in the last position 
> of the array.
> {code}
> According to client, only the last result can be partial, thus this repeated 
> bool can be converted to a bool, thus reducing overhead of serialization and 
> deserialization of the array. This will break wire compatibility therefore 
> this is something to look for in upcoming versions.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-18097) Save bandwidth on partial_flag_per_result in ScanResponse proto

2017-05-31 Thread Karan Mehta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karan Mehta updated HBASE-18097:

Description: 
Currently the {{ScanResponse}} proto sends out 1 bit per {{Result}} that it 
embeds inside the {{CellScanner}} to indicate whether it is partial or not. 
{code}
// In every RPC response there should be at most a single partial result. Furthermore, if
// there is a partial result, it is guaranteed to be in the last position of the array.
{code}
According to the client, only the last result can be partial, so this repeated 
bool can be converted to a single bool, reducing the overhead of serializing 
and deserializing the array. This will break wire compatibility, so it is 
something to consider for upcoming versions.

  was:
Starting version 1.3, HBase automatically closes scanner on server side 
whenever the results are exhausted and corresponding bits are set in the 
{{ScanResponse}} proto returned to the client. We can use that info to 
eliminate the closeScanRequest RPC call, thereby saving 1 RPC per region per 
scan. This can be particularly useful for tables with more regions.

Also, currently the {{ScanResponse}} proto sends out 1 bit per {{Result}} that 
it has embeds inside the {{CellScanner}} to indicate if it is partial or not. 
{code}
// In every RPC response there should be at most a single partial result. 
Furthermore, if
// there is a partial result, it is guaranteed to be in the last position 
of the array.
{code}
According to client, only the last result can be partial, thus this repeated 
bool can be converted to a bool, thus reducing overhead of serialization and 
deserialization of the array. This will break wire compatibility therefore this 
is something to look for in upcoming versions.


> Save bandwidth on partial_flag_per_result in ScanResponse proto
> ---
>
> Key: HBASE-18097
> URL: https://issues.apache.org/jira/browse/HBASE-18097
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 1.3.2
>Reporter: Karan Mehta
>
> Currently the {{ScanResponse}} proto sends out 1 bit per {{Result}} that it 
> has embeds inside the {{CellScanner}} to indicate if it is partial or not. 
> {code}
> // In every RPC response there should be at most a single partial result. 
> Furthermore, if
> // there is a partial result, it is guaranteed to be in the last position 
> of the array.
> {code}
> According to client, only the last result can be partial, thus this repeated 
> bool can be converted to a bool, thus reducing overhead of serialization and 
> deserialization of the array. This will break wire compatibility therefore 
> this is something to look for in upcoming versions.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (HBASE-18097) Save bandwidth on partial_flag_per_result in ScanResponse proto

2017-05-31 Thread Karan Mehta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karan Mehta updated HBASE-18097:

Affects Version/s: (was: 1.3.2)
   1.4.0
   2.0.0

> Save bandwidth on partial_flag_per_result in ScanResponse proto
> ---
>
> Key: HBASE-18097
> URL: https://issues.apache.org/jira/browse/HBASE-18097
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 2.0.0, 1.4.0
>Reporter: Karan Mehta
>
> Currently the {{ScanResponse}} proto sends out 1 bit per {{Result}} that it 
> has embeds inside the {{CellScanner}} to indicate if it is partial or not. 
> {code}
> // In every RPC response there should be at most a single partial result. 
> Furthermore, if
> // there is a partial result, it is guaranteed to be in the last position 
> of the array.
> {code}
> According to client, only the last result can be partial, thus this repeated 
> bool can be converted to a bool, thus reducing overhead of serialization and 
> deserialization of the array. This will break wire compatibility therefore 
> this is something to look for in upcoming versions.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (HBASE-18097) Save bandwidth on partial_flag_per_result in ScanResponse proto

2017-05-31 Thread Karan Mehta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karan Mehta reassigned HBASE-18097:
---

Assignee: Karan Mehta

> Save bandwidth on partial_flag_per_result in ScanResponse proto
> ---
>
> Key: HBASE-18097
> URL: https://issues.apache.org/jira/browse/HBASE-18097
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 2.0.0, 1.4.0
>Reporter: Karan Mehta
>Assignee: Karan Mehta
>
> Currently the {{ScanResponse}} proto sends out 1 bit per {{Result}} that it 
> has embeds inside the {{CellScanner}} to indicate if it is partial or not. 
> {code}
> // In every RPC response there should be at most a single partial result. 
> Furthermore, if
> // there is a partial result, it is guaranteed to be in the last position 
> of the array.
> {code}
> According to client, only the last result can be partial, thus this repeated 
> bool can be converted to a bool, thus reducing overhead of serialization and 
> deserialization of the array. This will break wire compatibility therefore 
> this is something to look for in upcoming versions.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-18097) Save bandwidth on partial_flag_per_result in ScanResponse proto

2017-06-04 Thread Karan Mehta (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16036448#comment-16036448
 ] 

Karan Mehta commented on HBASE-18097:
-

A problem can occur if the client requests results in a specified batch size. 
In that case, a response can contain multiple partial results, and it is left 
to the user to handle them appropriately, based on the {{partial}} flag inside 
each result. This is usually the case with AsyncHBaseClient.
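
The concern can be shown with a small, self-contained sketch. It does not touch the real protobuf classes; {{PartialFlags}} and its methods are hypothetical helpers that only model the list of per-result flags. A single bool is enough only when no result other than the last one is partial, which does not hold for the batched case described above.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper modelling the per-result partial flags of one scan response.
final class PartialFlags {
  private PartialFlags() {
  }

  // True when only the final flag may be set, so one bool would carry the same information.
  static boolean fitsSingleFlag(List<Boolean> partialFlagPerResult) {
    for (int i = 0; i < partialFlagPerResult.size() - 1; i++) {
      if (partialFlagPerResult.get(i)) {
        return false; // a non-final result is partial: information would be lost
      }
    }
    return true;
  }

  // Rebuilds the per-result flags from a single "last result is partial" bool.
  static boolean[] expand(int resultCount, boolean lastIsPartial) {
    boolean[] flags = new boolean[resultCount];
    if (resultCount > 0) {
      flags[resultCount - 1] = lastIsPartial;
    }
    return flags;
  }
}
```

For a response whose flags are (false, false, true), {{fitsSingleFlag}} returns true and {{expand(3, true)}} reproduces the same flags. For a batched response such as (true, true, false), it returns false, so the proposed single bool could not represent it.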

> Save bandwidth on partial_flag_per_result in ScanResponse proto
> ---
>
> Key: HBASE-18097
> URL: https://issues.apache.org/jira/browse/HBASE-18097
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 2.0.0, 1.4.0
>Reporter: Karan Mehta
>Assignee: Karan Mehta
>
> Currently the {{ScanResponse}} proto sends out 1 bit per {{Result}} that it 
> has embeds inside the {{CellScanner}} to indicate if it is partial or not. 
> {code}
> // In every RPC response there should be at most a single partial result. 
> Furthermore, if
> // there is a partial result, it is guaranteed to be in the last position 
> of the array.
> {code}
> According to client, only the last result can be partial, thus this repeated 
> bool can be converted to a bool, thus reducing overhead of serialization and 
> deserialization of the array. This will break wire compatibility therefore 
> this is something to look for in upcoming versions.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (HBASE-18097) Save bandwidth on partial_flag_per_result in ScanResponse proto

2017-06-07 Thread Karan Mehta (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16036448#comment-16036448
 ] 

Karan Mehta edited comment on HBASE-18097 at 6/7/17 6:06 PM:
-

A problem can occur if the client requests results in a specified batch size. 
In that case, a response can contain multiple partial results, and it is left 
to the user to handle them appropriately, based on the {{partial}} flag inside 
each result. This is usually the case with AsyncHBaseClient.
Any suggestions, [~enis] or [~Apache9]?


was (Author: karanmehta93):
The problem can occur if the client wants results in a specified batch size, in 
which case, the results can contain multiple partial results, which is then 
left to the user to handle appropriately, based on the {{partial}} flag inside 
the result. This is usually the case with AsyncHBaseClient.

> Save bandwidth on partial_flag_per_result in ScanResponse proto
> ---
>
> Key: HBASE-18097
> URL: https://issues.apache.org/jira/browse/HBASE-18097
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 2.0.0, 1.4.0
>Reporter: Karan Mehta
>Assignee: Karan Mehta
>
> Currently the {{ScanResponse}} proto sends out 1 bit per {{Result}} that it 
> has embeds inside the {{CellScanner}} to indicate if it is partial or not. 
> {code}
> // In every RPC response there should be at most a single partial result. 
> Furthermore, if
> // there is a partial result, it is guaranteed to be in the last position 
> of the array.
> {code}
> According to client, only the last result can be partial, thus this repeated 
> bool can be converted to a bool, thus reducing overhead of serialization and 
> deserialization of the array. This will break wire compatibility therefore 
> this is something to look for in upcoming versions.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HBASE-18228) HBCK improvements

2017-06-19 Thread Karan Mehta (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16054461#comment-16054461
 ] 

Karan Mehta commented on HBASE-18228:
-

[~lhofhansl] [~apurtell]
I can take it up.

> HBCK improvements
> -
>
> Key: HBASE-18228
> URL: https://issues.apache.org/jira/browse/HBASE-18228
> Project: HBase
>  Issue Type: Improvement
>  Components: hbck
>Reporter: Lars Hofhansl
>Priority: Critical
> Fix For: 1.4.0
>
>
> We just had a prod issue and running HBCK the way we did actually causes more 
> problems.
> In part HBCK did stuff we did not expect, in part we had little visibility 
> into what HBCK was doing, and in part the logging was confusing.
> I'm proposing 2 improvements:
> 1. A dry-run mode. Run, and just list what would have been done.
> 2. An interactive mode. Run, and for each action request Y/N user input. So 
> that a user can opt-out of stuff.
> [~jmhsieh], FYI



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HBASE-18228) HBCK improvements

2017-06-19 Thread Karan Mehta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karan Mehta reassigned HBASE-18228:
---

Assignee: Karan Mehta

> HBCK improvements
> -
>
> Key: HBASE-18228
> URL: https://issues.apache.org/jira/browse/HBASE-18228
> Project: HBase
>  Issue Type: Improvement
>  Components: hbck
>Reporter: Lars Hofhansl
>Assignee: Karan Mehta
>Priority: Critical
> Fix For: 1.4.0
>
>
> We just had a prod issue and running HBCK the way we did actually causes more 
> problems.
> In part HBCK did stuff we did not expect, in part we had little visibility 
> into what HBCK was doing, and in part the logging was confusing.
> I'm proposing 2 improvements:
> 1. A dry-run mode. Run, and just list what would have been done.
> 2. An interactive mode. Run, and for each action request Y/N user input. So 
> that a user can opt-out of stuff.
> [~jmhsieh], FYI



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18228) HBCK improvements

2017-06-26 Thread Karan Mehta (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16063613#comment-16063613
 ] 

Karan Mehta commented on HBASE-18228:
-

What should be the granularity of the operation? 
For example, running -fixAssignments or -fixHoles on a table would run certain 
steps for all the regions. Do we need to ask the user for confirmation of each 
individual step, or for the command as a whole?
The pros are
 - More granularity, more power / flexibility to the user
The cons are
 - A lot of questions / decisions for the user if the table has a large number 
of regions
 - Hbck will run in parallel for every regionserver, so the messages will be 
intermingled.
 - The user might accidentally leave the cluster in an unhealthy state, for 
example by deciding to fix some holes in meta but not others. 

The alternate option is to get user confirmation before every major step, which 
would help if switches like -repair are used, which internally perform a bunch 
of other steps.

[~jmhsieh] [~lhofhansl] [~apurtell] [~churromorales] Please suggest.

> HBCK improvements
> -
>
> Key: HBASE-18228
> URL: https://issues.apache.org/jira/browse/HBASE-18228
> Project: HBase
>  Issue Type: Improvement
>  Components: hbck
>Reporter: Lars Hofhansl
>Assignee: Karan Mehta
>Priority: Critical
> Fix For: 1.4.0
>
>
> We just had a prod issue and running HBCK the way we did actually causes more 
> problems.
> In part HBCK did stuff we did not expect, in part we had little visibility 
> into what HBCK was doing, and in part the logging was confusing.
> I'm proposing 2 improvements:
> 1. A dry-run mode. Run, and just list what would have been done.
> 2. An interactive mode. Run, and for each action request Y/N user input. So 
> that a user can opt-out of stuff.
> [~jmhsieh], FYI



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (HBASE-18228) HBCK improvements

2017-06-26 Thread Karan Mehta (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16063613#comment-16063613
 ] 

Karan Mehta edited comment on HBASE-18228 at 6/26/17 7:21 PM:
--

What should be the granularity of the operation? 
For example, running -fixAssignments or -fixHoles on a table would run certain 
steps for all the regions. Do we need to ask the user for confirmation of each 
individual step, or for the command as a whole?
The pros are
* More granularity, more power / flexibility to the user

The cons are
* A lot of questions / decisions for the user if the table has a large number 
of regions
* Hbck will run in parallel for every regionserver, so the messages will be 
intermingled.
* The user might accidentally leave the cluster in an unhealthy state, for 
example by deciding to fix some holes in meta but not others. 
* The alternate option is to get user confirmation before every major step, 
which would help if switches like -repair are used, which internally perform 
a bunch of other steps.

[~jmhsieh] [~lhofhansl] [~apurtell] [~churromorales] Please suggest.


was (Author: karanmehta93):
What should be the granularity of the operation? 
For example, running -fixAssigments or -fixHoles on a table, would run certain 
steps for the all the regions. Do we need to ask the user for individual step 
confirmation or for the command as a hole?
The pros are
 - More granularity, more power / flexibility to the user
The cons are
 - Lot of questions / decisions for user if the table has large number of 
regions
 - Hbck will run in parallel for every regionserver. The messages will be 
intermingled.
 - User might accidentally leave cluster in unhealthy state. For example, if 
the user decides to fix certain holes vs not fixing some of them in meta. 

The alternate option is to get user confirmation before every major step, which 
would help if switches like -repair is used, which internally performs bunch of 
other steps.

[~jmhsieh] [~lhofhansl] [~apurtell] [~churromorales] Please suggest.

> HBCK improvements
> -
>
> Key: HBASE-18228
> URL: https://issues.apache.org/jira/browse/HBASE-18228
> Project: HBase
>  Issue Type: Improvement
>  Components: hbck
>Reporter: Lars Hofhansl
>Assignee: Karan Mehta
>Priority: Critical
> Fix For: 1.4.0
>
>
> We just had a prod issue and running HBCK the way we did actually causes more 
> problems.
> In part HBCK did stuff we did not expect, in part we had little visibility 
> into what HBCK was doing, and in part the logging was confusing.
> I'm proposing 2 improvements:
> 1. A dry-run mode. Run, and just list what would have been done.
> 2. An interactive mode. Run, and for each action request Y/N user input. So 
> that a user can opt-out of stuff.
> [~jmhsieh], FYI



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (HBASE-18228) HBCK improvements

2017-06-26 Thread Karan Mehta (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16063613#comment-16063613
 ] 

Karan Mehta edited comment on HBASE-18228 at 6/26/17 7:22 PM:
--

What should be the granularity of the operation? 
For example, running -fixAssignments or -fixHoles on a table would run certain 
steps for all the regions. Do we need to ask the user for confirmation of each 
individual step, or for the command as a whole?
The pros are
* More granularity, more power / flexibility to the user

The cons are
* A lot of questions / decisions for the user if the table has a large number 
of regions
* Hbck will run in parallel for every regionserver, so the messages will be 
intermingled.
* The user might accidentally leave the cluster in an unhealthy state, for 
example by deciding to fix some holes in meta but not others. 

The alternate option is to get user confirmation before every major step, which 
would help if switches like -repair are used, which internally perform a bunch 
of other steps.

[~jmhsieh] [~lhofhansl] [~apurtell] [~churromorales] Please suggest.


was (Author: karanmehta93):
What should be the granularity of the operation? 
For example, running -fixAssigments or -fixHoles on a table, would run certain 
steps for the all the regions. Do we need to ask the user for individual step 
confirmation or for the command as a hole?
The pros are
* More granularity, more power / flexibility to the user

The cons are
* Lot of questions / decisions for user if the table has large number of regions
* Hbck will run in parallel for every regionserver. The messages will be 
intermingled.
* User might accidentally leave cluster in unhealthy state. For example, if the 
user decides to fix certain holes vs not fixing some of them in meta. 
* The alternate option is to get user confirmation before every major step, 
which would help if switches like -repair is used, which internally performs 
bunch of other steps.

[~jmhsieh] [~lhofhansl] [~apurtell] [~churromorales] Please suggest.

> HBCK improvements
> -
>
> Key: HBASE-18228
> URL: https://issues.apache.org/jira/browse/HBASE-18228
> Project: HBase
>  Issue Type: Improvement
>  Components: hbck
>Reporter: Lars Hofhansl
>Assignee: Karan Mehta
>Priority: Critical
> Fix For: 1.4.0
>
>
> We just had a prod issue and running HBCK the way we did actually causes more 
> problems.
> In part HBCK did stuff we did not expect, in part we had little visibility 
> into what HBCK was doing, and in part the logging was confusing.
> I'm proposing 2 improvements:
> 1. A dry-run mode. Run, and just list what would have been done.
> 2. An interactive mode. Run, and for each action request Y/N user input. So 
> that a user can opt-out of stuff.
> [~jmhsieh], FYI



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18228) HBCK improvements

2017-06-30 Thread Karan Mehta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karan Mehta updated HBASE-18228:

Status: Patch Available  (was: Open)

> HBCK improvements
> -
>
> Key: HBASE-18228
> URL: https://issues.apache.org/jira/browse/HBASE-18228
> Project: HBase
>  Issue Type: Improvement
>  Components: hbck
>Reporter: Lars Hofhansl
>Assignee: Karan Mehta
>Priority: Critical
> Fix For: 1.4.0
>
> Attachments: HBASE-18228.branch-1.3.patch
>
>
> We just had a prod issue and running HBCK the way we did actually causes more 
> problems.
> In part HBCK did stuff we did not expect, in part we had little visibility 
> into what HBCK was doing, and in part the logging was confusing.
> I'm proposing 2 improvements:
> 1. A dry-run mode. Run, and just list what would have been done.
> 2. An interactive mode. Run, and for each action request Y/N user input. So 
> that a user can opt-out of stuff.
> [~jmhsieh], FYI



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18228) HBCK improvements

2017-06-30 Thread Karan Mehta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karan Mehta updated HBASE-18228:

Attachment: HBASE-18228.branch-1.3.patch

The patch adds two new switches to hbck:
* {{-dryRun}} --> Runs HBCK without modifying anything and prints the changes 
this run would make. The output cannot be fully detailed, since some 
operations can only be performed after an earlier one is done. 
* {{-i}} --> Interactive HBCK. Asks for user input before every potential 
modification, for example before fixing a particular hole in META or creating 
a new .regionInfo file. 

It also asks for user confirmation for options such as -repair and 
-repairHoles, which internally run several other switches.
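
The behaviour of the two switches can be sketched roughly as below. This is only an illustration, not the code in the attached patch: the {{RepairGate}} class, its {{Mode}} enum and the {{attempt}} method are hypothetical names, and the operator prompt is passed in as a predicate so the sketch stays testable.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical gate deciding whether a repair action runs; not the actual hbck code. */
final class RepairGate {
    enum Mode { NORMAL, DRY_RUN, INTERACTIVE }

    private final Mode mode;
    // Asks the operator about one action; returns true when the answer is "Y".
    private final Predicate<String> confirm;
    private final List<String> log = new ArrayList<>();

    RepairGate(Mode mode, Predicate<String> confirm) {
        this.mode = mode;
        this.confirm = confirm;
    }

    /** Runs the action only if the mode (and, in interactive mode, the operator) allows it. */
    boolean attempt(String description, Runnable action) {
        if (mode == Mode.DRY_RUN) {
            // Dry run: record what would have been done, touch nothing.
            log.add("WOULD: " + description);
            return false;
        }
        if (mode == Mode.INTERACTIVE && !confirm.test(description)) {
            // Operator opted out of this particular action.
            log.add("SKIPPED: " + description);
            return false;
        }
        action.run();
        log.add("DONE: " + description);
        return true;
    }

    List<String> log() {
        return log;
    }
}
```

Routing every modification through a single gate like this would keep the dry-run and interactive checks in one place instead of scattering them across the individual fix methods.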

[~apurtell] [~jmhsieh] [~mdrob] Please review. 

> HBCK improvements
> -
>
> Key: HBASE-18228
> URL: https://issues.apache.org/jira/browse/HBASE-18228
> Project: HBase
>  Issue Type: Improvement
>  Components: hbck
>Reporter: Lars Hofhansl
>Assignee: Karan Mehta
>Priority: Critical
> Fix For: 1.4.0
>
> Attachments: HBASE-18228.branch-1.3.patch
>
>
> We just had a prod issue and running HBCK the way we did actually causes more 
> problems.
> In part HBCK did stuff we did not expect, in part we had little visibility 
> into what HBCK was doing, and in part the logging was confusing.
> I'm proposing 2 improvements:
> 1. A dry-run mode. Run, and just list what would have been done.
> 2. An interactive mode. Run, and for each action request Y/N user input. So 
> that a user can opt-out of stuff.
> [~jmhsieh], FYI





[jira] [Commented] (HBASE-18228) HBCK improvements

2017-06-30 Thread Karan Mehta (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16070747#comment-16070747
 ] 

Karan Mehta commented on HBASE-18228:
-

bq. Also, curious, why branch 1.3 specifically?
I will submit a patch for branch-1 as well. I started working on this branch, 
and the scope of this patch is limited to 1.4 anyway.

[~mdrob] Added the review board link. 

> HBCK improvements
> -
>
> Key: HBASE-18228
> URL: https://issues.apache.org/jira/browse/HBASE-18228
> Project: HBase
>  Issue Type: Improvement
>  Components: hbck
>Reporter: Lars Hofhansl
>Assignee: Karan Mehta
>Priority: Critical
> Fix For: 1.4.0
>
> Attachments: HBASE-18228.branch-1.3.patch
>
>
> We just had a prod issue and running HBCK the way we did actually causes more 
> problems.
> In part HBCK did stuff we did not expect, in part we had little visibility 
> into what HBCK was doing, and in part the logging was confusing.
> I'm proposing 2 improvements:
> 1. A dry-run mode. Run, and just list what would have been done.
> 2. An interactive mode. Run, and for each action request Y/N user input. So 
> that a user can opt-out of stuff.
> [~jmhsieh], FYI





[jira] [Comment Edited] (HBASE-18228) HBCK improvements

2017-07-06 Thread Karan Mehta (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16076809#comment-16076809
 ] 

Karan Mehta edited comment on HBASE-18228 at 7/6/17 4:08 PM:
-

[~mdrob] [~te...@apache.org]
Can you provide some feedback on my reply on review board?


was (Author: karanmehta93):
[~mdrob]
Can you provide some feedback on my reply on review board?

> HBCK improvements
> -
>
> Key: HBASE-18228
> URL: https://issues.apache.org/jira/browse/HBASE-18228
> Project: HBase
>  Issue Type: Improvement
>  Components: hbck
>Reporter: Lars Hofhansl
>Assignee: Karan Mehta
>Priority: Critical
> Fix For: 1.4.0
>
> Attachments: HBASE-18228.branch-1.3.patch
>
>
> We just had a prod issue and running HBCK the way we did actually causes more 
> problems.
> In part HBCK did stuff we did not expect, in part we had little visibility 
> into what HBCK was doing, and in part the logging was confusing.
> I'm proposing 2 improvements:
> 1. A dry-run mode. Run, and just list what would have been done.
> 2. An interactive mode. Run, and for each action request Y/N user input. So 
> that a user can opt-out of stuff.
> [~jmhsieh], FYI





[jira] [Commented] (HBASE-18228) HBCK improvements

2017-07-06 Thread Karan Mehta (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16076809#comment-16076809
 ] 

Karan Mehta commented on HBASE-18228:
-

[~mdrob]
Can you provide some feedback on my reply on review board?

> HBCK improvements
> -
>
> Key: HBASE-18228
> URL: https://issues.apache.org/jira/browse/HBASE-18228
> Project: HBase
>  Issue Type: Improvement
>  Components: hbck
>Reporter: Lars Hofhansl
>Assignee: Karan Mehta
>Priority: Critical
> Fix For: 1.4.0
>
> Attachments: HBASE-18228.branch-1.3.patch
>
>
> We just had a prod issue and running HBCK the way we did actually causes more 
> problems.
> In part HBCK did stuff we did not expect, in part we had little visibility 
> into what HBCK was doing, and in part the logging was confusing.
> I'm proposing 2 improvements:
> 1. A dry-run mode. Run, and just list what would have been done.
> 2. An interactive mode. Run, and for each action request Y/N user input. So 
> that a user can opt-out of stuff.
> [~jmhsieh], FYI





[jira] [Created] (HBASE-18396) Encode ZNode names to reduce ZooKeeper jute buffer length requirements and thus reduce memory usage

2017-07-17 Thread Karan Mehta (JIRA)
Karan Mehta created HBASE-18396:
---

 Summary: Encode ZNode names to reduce ZooKeeper jute buffer length 
requirements and thus reduce memory usage
 Key: HBASE-18396
 URL: https://issues.apache.org/jira/browse/HBASE-18396
 Project: HBase
  Issue Type: Improvement
Affects Versions: 3.0.0
Reporter: Karan Mehta


In our production environment, we hit the error {{ZooKeeper connectionLoss due 
to jute.maxbuffer len of 1M getting exceeded}}. 1 MB is usually plenty, but a 
multi request can exceed the maximum buffer length that is allocated.

This JIRA is for discussing the encoding of various znode names. IMO, this 
will reduce the path lengths, which reduces the required buffer size and the 
network packet size and lets us pack more requests into a single multi. As 
with any encoding, this will introduce overhead, so we need to determine how 
feasible this idea is.
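
To make the size argument concrete, here is a rough estimator for how much of the buffer the paths in one multi request consume. It is a sketch under stated assumptions: the {{MultiSizeEstimate}} class and the {{perOpOverhead}} parameter are hypothetical, and the real serialized size also depends on the payload and type of each operation. The only constant taken from ZooKeeper is the documented {{jute.maxbuffer}} default of 0xfffff bytes.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;

/** Hypothetical helper for estimating the size of a multi request; illustration only. */
final class MultiSizeEstimate {
    /** ZooKeeper's documented default for jute.maxbuffer (0xfffff, just under 1 MB). */
    static final int DEFAULT_JUTE_MAX_BUFFER = 0xfffff;

    /** Sums the UTF-8 path bytes plus an assumed fixed overhead per operation. */
    static long estimateBytes(List<String> paths, int perOpOverhead) {
        long total = 0;
        for (String path : paths) {
            total += path.getBytes(StandardCharsets.UTF_8).length + perOpOverhead;
        }
        return total;
    }

    /** True if the estimate stays within the default buffer limit. */
    static boolean fits(List<String> paths, int perOpOverhead) {
        return estimateBytes(paths, perOpOverhead) <= DEFAULT_JUTE_MAX_BUFFER;
    }
}
```

Since the path bytes are paid once per operation, shortening each znode name by a fixed number of bytes saves that amount multiplied by the number of operations in the multi.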





[jira] [Commented] (HBASE-18396) Encode ZNode names to reduce ZooKeeper jute buffer length requirements and thus reduce memory usage

2017-08-08 Thread Karan Mehta (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16118956#comment-16118956
 ] 

Karan Mehta commented on HBASE-18396:
-

[~mdrob] 
Could you please elaborate?

> Encode ZNode names to reduce ZooKeeper jute buffer length requirements and 
> thus reduce memory usage
> ---
>
> Key: HBASE-18396
> URL: https://issues.apache.org/jira/browse/HBASE-18396
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0
>Reporter: Karan Mehta
>
> In our production environment, we hit the error {{ZooKeeper connectionLoss 
> due to jute.maxbuffer len of 1M getting exceeded}}. 1 MB is usually plenty, 
> but a multi request can exceed the maximum buffer length that is allocated.
> This JIRA is for discussing the encoding of various znode names. IMO, this 
> will reduce the path lengths, which reduces the required buffer size and the 
> network packet size and lets us pack more requests into a single multi. As 
> with any encoding, this will introduce overhead, so we need to determine how 
> feasible this idea is.





[jira] [Commented] (HBASE-18228) HBCK improvements

2018-03-07 Thread Karan Mehta (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16389950#comment-16389950
 ] 

Karan Mehta commented on HBASE-18228:
-

[~lhofhansl] A patch was attempted for this JIRA, but it doesn't seem as 
useful as expected. Would you mind discussing other potential improvements 
here? FYI, this issue specifically addresses branch-1.3 (possibly 1.4).

> HBCK improvements
> -
>
> Key: HBASE-18228
> URL: https://issues.apache.org/jira/browse/HBASE-18228
> Project: HBase
>  Issue Type: Improvement
>  Components: hbck
>Reporter: Lars Hofhansl
>Assignee: Karan Mehta
>Priority: Critical
> Fix For: 1.5.0, 1.4.3
>
> Attachments: HBASE-18228.branch-1.3.patch
>
>
> We just had a prod issue and running HBCK the way we did actually causes more 
> problems.
> In part HBCK did stuff we did not expect, in part we had little visibility 
> into what HBCK was doing, and in part the logging was confusing.
> I'm proposing 2 improvements:
> 1. A dry-run mode. Run, and just list what would have been done.
> 2. An interactive mode. Run, and for each action request Y/N user input. So 
> that a user can opt-out of stuff.
> [~jmhsieh], FYI



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-18097) Save bandwidth on partial_flag_per_result in ScanResponse proto

2017-11-07 Thread Karan Mehta (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242832#comment-16242832
 ] 

Karan Mehta commented on HBASE-18097:
-

Ping [~enis] [~Apache9]
Any thoughts or suggestions? Is the improvement worth it?

> Save bandwidth on partial_flag_per_result in ScanResponse proto
> ---
>
> Key: HBASE-18097
> URL: https://issues.apache.org/jira/browse/HBASE-18097
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 2.0.0, 3.0.0, 1.5.0
>Reporter: Karan Mehta
>Assignee: Karan Mehta
>
> Currently the {{ScanResponse}} proto sends out one boolean flag per 
> {{Result}} that it embeds inside the {{CellScanner}} to indicate whether it 
> is partial or not. 
> {code}
> // In every RPC response there should be at most a single partial result. Furthermore, if
> // there is a partial result, it is guaranteed to be in the last position of the array.
> {code}
> According to the client, only the last result can be partial, so this 
> repeated bool can be converted to a single bool, reducing the overhead of 
> serializing and deserializing the array. This will break wire compatibility, 
> so it is something to consider for upcoming versions.
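
The proposed change can be sketched as follows. This is an illustration rather than the actual protobuf or client code: the {{PartialFlags}} class and its {{collapse}} and {{expand}} methods are hypothetical names, and the sketch assumes the invariant quoted above, namely that only the last result in a response may be partial.

```java
import java.util.List;

/** Hypothetical helper showing how a per-result flag list reduces to one flag. */
final class PartialFlags {
    /** Sender side: collapses per-result flags, assuming only the last result may be partial. */
    static boolean collapse(List<Boolean> perResult) {
        for (int i = 0; i < perResult.size() - 1; i++) {
            if (perResult.get(i)) {
                throw new IllegalArgumentException("partial result at non-final index " + i);
            }
        }
        return !perResult.isEmpty() && perResult.get(perResult.size() - 1);
    }

    /** Receiver side: rebuilds the per-result view from the result count and the single flag. */
    static boolean[] expand(int resultCount, boolean lastIsPartial) {
        boolean[] flags = new boolean[resultCount];
        if (resultCount > 0) {
            flags[resultCount - 1] = lastIsPartial;
        }
        return flags;
    }
}
```

Because the receiver already knows how many results the response carries, the single flag loses no information as long as the invariant holds.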





[jira] [Assigned] (HBASE-21553) schedLock not released ni MasterProcedureScheduler

2018-12-05 Thread Karan Mehta (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karan Mehta reassigned HBASE-21553:
---

Assignee: Karan Mehta

> schedLock not released ni MasterProcedureScheduler
> --
>
> Key: HBASE-21553
> URL: https://issues.apache.org/jira/browse/HBASE-21553
> Project: HBase
>  Issue Type: Improvement
>Reporter: Xu Cang
>Assignee: Karan Mehta
>Priority: Major
>
> https://github.com/apache/hbase/blob/branch-1/hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/MasterProcedureScheduler.java#L749





[jira] [Updated] (HBASE-21553) schedLock not released in MasterProcedureScheduler

2018-12-05 Thread Karan Mehta (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karan Mehta updated HBASE-21553:

Summary: schedLock not released in MasterProcedureScheduler  (was: 
schedLock not released ni MasterProcedureScheduler)

> schedLock not released in MasterProcedureScheduler
> --
>
> Key: HBASE-21553
> URL: https://issues.apache.org/jira/browse/HBASE-21553
> Project: HBase
>  Issue Type: Improvement
>Reporter: Xu Cang
>Priority: Major
>
> https://github.com/apache/hbase/blob/branch-1/hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/MasterProcedureScheduler.java#L749





[jira] [Assigned] (HBASE-21553) schedLock not released ni MasterProcedureScheduler

2018-12-05 Thread Karan Mehta (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karan Mehta reassigned HBASE-21553:
---

Assignee: (was: Karan Mehta)

> schedLock not released ni MasterProcedureScheduler
> --
>
> Key: HBASE-21553
> URL: https://issues.apache.org/jira/browse/HBASE-21553
> Project: HBase
>  Issue Type: Improvement
>Reporter: Xu Cang
>Priority: Major
>
> https://github.com/apache/hbase/blob/branch-1/hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/MasterProcedureScheduler.java#L749





[jira] [Assigned] (HBASE-21553) schedLock not released ni MasterProcedureScheduler

2018-12-05 Thread Karan Mehta (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karan Mehta reassigned HBASE-21553:
---

Assignee: Karan Mehta

> schedLock not released ni MasterProcedureScheduler
> --
>
> Key: HBASE-21553
> URL: https://issues.apache.org/jira/browse/HBASE-21553
> Project: HBase
>  Issue Type: Improvement
>Reporter: Xu Cang
>Assignee: Karan Mehta
>Priority: Major
>
> https://github.com/apache/hbase/blob/branch-1/hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/MasterProcedureScheduler.java#L749





[jira] [Assigned] (HBASE-21553) schedLock not released ni MasterProcedureScheduler

2018-12-05 Thread Karan Mehta (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karan Mehta reassigned HBASE-21553:
---

Assignee: (was: Karan Mehta)

> schedLock not released ni MasterProcedureScheduler
> --
>
> Key: HBASE-21553
> URL: https://issues.apache.org/jira/browse/HBASE-21553
> Project: HBase
>  Issue Type: Improvement
>Reporter: Xu Cang
>Priority: Major
>
> https://github.com/apache/hbase/blob/branch-1/hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/MasterProcedureScheduler.java#L749





[jira] [Commented] (HBASE-21553) schedLock not released in MasterProcedureScheduler

2018-12-05 Thread Karan Mehta (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16710977#comment-16710977
 ] 

Karan Mehta commented on HBASE-21553:
-

Good find, [~xucang]!

FYI [~sukumaddineni] [~swaroopa]

This is probably the root cause of stuck procedures in the cluster.

> schedLock not released in MasterProcedureScheduler
> --
>
> Key: HBASE-21553
> URL: https://issues.apache.org/jira/browse/HBASE-21553
> Project: HBase
>  Issue Type: Improvement
>Reporter: Xu Cang
>Priority: Major
>
> https://github.com/apache/hbase/blob/branch-1/hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/MasterProcedureScheduler.java#L749
> As shown above, we didn't unlock schedLock, which can cause a deadlock.
> Besides this, other places in this class handle schedLock.unlock in a 
> risky manner. I'd like to move them into finally blocks to improve the 
> robustness of lock handling.
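
The pattern described above, releasing the lock in a finally block, looks roughly like this. It is a minimal sketch and not the MasterProcedureScheduler code: the {{GuardedQueue}} class is hypothetical, and only the {{schedLock}} field name is borrowed from the issue.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical queue guarded by a lock that is always released in finally. */
final class GuardedQueue {
    private final ReentrantLock schedLock = new ReentrantLock();
    private final Queue<String> queue = new ArrayDeque<>();

    void add(String item) {
        schedLock.lock();
        try {
            // Any exception thrown here still reaches the finally block below.
            if (item == null) {
                throw new IllegalArgumentException("item must not be null");
            }
            queue.add(item);
        } finally {
            schedLock.unlock();
        }
    }

    String poll() {
        schedLock.lock();
        try {
            // Early returns are safe too: finally runs before the method exits.
            return queue.poll();
        } finally {
            schedLock.unlock();
        }
    }

    boolean isLocked() {
        return schedLock.isLocked();
    }
}
```

With the unlock in finally, neither an exception nor an early return can leave the lock held, which is the failure mode the issue describes.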





[jira] [Commented] (HBASE-21553) schedLock not released in MasterProcedureScheduler

2018-12-10 Thread Karan Mehta (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16715995#comment-16715995
 ] 

Karan Mehta commented on HBASE-21553:
-

Is this not going into branch-1.3 or branch-1.2?

> schedLock not released in MasterProcedureScheduler
> --
>
> Key: HBASE-21553
> URL: https://issues.apache.org/jira/browse/HBASE-21553
> Project: HBase
>  Issue Type: Improvement
>Reporter: Xu Cang
>Assignee: Xu Cang
>Priority: Major
> Fix For: 1.5.0, 1.4.10
>
> Attachments: HBASE-21553-branch-1.001.patch, 
> HBASE-21553-branch-1.002.patch
>
>
> https://github.com/apache/hbase/blob/branch-1/hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/MasterProcedureScheduler.java#L749
> As shown above, we didn't unlock schedLock, which can cause a deadlock.
> Besides this, other places in this class handle schedLock.unlock in a 
> risky manner. I'd like to move them into finally blocks to improve the 
> robustness of lock handling.





[jira] [Assigned] (HBASE-18228) HBCK improvements

2019-04-11 Thread Karan Mehta (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-18228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karan Mehta reassigned HBASE-18228:
---

Assignee: (was: Karan Mehta)

> HBCK improvements
> -
>
> Key: HBASE-18228
> URL: https://issues.apache.org/jira/browse/HBASE-18228
> Project: HBase
>  Issue Type: Improvement
>  Components: hbck
>Reporter: Lars Hofhansl
>Priority: Critical
> Fix For: 3.0.0
>
> Attachments: HBASE-18228.branch-1.3.patch
>
>
> We just had a prod issue and running HBCK the way we did actually causes more 
> problems.
> In part HBCK did stuff we did not expect, in part we had little visibility 
> into what HBCK was doing, and in part the logging was confusing.
> I'm proposing 2 improvements:
> 1. A dry-run mode. Run, and just list what would have been done.
> 2. An interactive mode. Run, and for each action request Y/N user input. So 
> that a user can opt-out of stuff.
> [~jmhsieh], FYI


