[
https://issues.apache.org/jira/browse/HBASE-14796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15070286#comment-15070286
]
Zhan Zhang commented on HBASE-14796:
------------------------------------
Thanks [~ted.m] for the quick review. A performance test is reasonable, and I
will try to get hold of a physical cluster for it. It may take some time, as I
don't currently have one available.
On the other hand, I do think we should change it to perform BulkGet in the
executors regardless of the performance (although I think it should improve
performance rather than hurt it), because:
1. The current implementation does gather-scatter in the driver, which
increases network overhead and latency if the number of gets is large.
2. Failure recovery. It is hard to do failure recovery because the gets are
performed in the driver, which is a single point of failure.
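To make points 1 and 2 concrete, here is a minimal, self-contained sketch in
plain Java (no Spark or HBase dependency). All names (PartitionedGets,
partition, runTask, and the bulkGet function) are hypothetical and are not the
connector's API. It only illustrates the idea: split the row keys into
per-task batches, and retry a failed batch on its own, roughly the way a
failed executor task can be retried without affecting the rest of the job.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Hypothetical illustration only; not part of the HBase Spark connector. */
final class PartitionedGets {

    /** Splits the row keys round-robin into numPartitions batches, one per task. */
    static List<List<String>> partition(List<String> keys, int numPartitions) {
        if (numPartitions < 1) {
            throw new IllegalArgumentException("numPartitions must be >= 1");
        }
        List<List<String>> batches = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) {
            batches.add(new ArrayList<>());
        }
        for (int i = 0; i < keys.size(); i++) {
            batches.get(i % numPartitions).add(keys.get(i));
        }
        return batches;
    }

    /**
     * Runs one batch through bulkGet, trying up to maxAttempts times.
     * A failure only affects this batch; other batches are untouched.
     */
    static List<String> runTask(List<String> batch,
                                Function<List<String>, List<String>> bulkGet,
                                int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return bulkGet.apply(batch);
            } catch (RuntimeException e) {
                last = e; // remember the failure and try this batch again
            }
        }
        throw last;
    }
}
```

In the driver-side approach there is no such per-batch unit to retry: one
failure in the single gather step affects the whole application.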
The above two have been discussed in detail. But I just realized there is
another potential issue: the current implementation may go against the Spark
SQL engine design, as described below.
3. Currently, the bulkGet happens during query planning (buildScan), and the
results stay in the driver (1st). The results are distributed to the executors
during query execution (2nd).
3.1 The 1st and 2nd steps do not always happen as a pair. Even worse, sometimes
only the 1st happens: for example, a user calls plan.explain but never triggers
the plan execution.
3.2 Memory taken by table.get may never get released in the driver, increasing
the driver memory overhead.
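A small sketch of point 3 in plain Java may help. It is only an illustration
under my own assumptions: FakeTable and ScanPlan are hypothetical stand-ins,
not Spark or HBase classes. The eager variant performs the gets while the plan
is built (like doing them in buildScan), so calling explain alone has already
paid for the fetch and the plan keeps holding the rows. The deferred variant
touches the table only when the plan is executed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical stand-in for an HBase table; counts how often get() is called. */
final class FakeTable {
    int getCalls = 0;

    List<String> get(List<String> rowKeys) {
        getCalls++;
        List<String> rows = new ArrayList<>();
        for (String key : rowKeys) {
            rows.add("row-" + key);
        }
        return rows;
    }
}

/** Toy "scan plan": explain() only describes it, execute() produces the rows. */
final class ScanPlan {
    private final String description;
    private final Supplier<List<String>> rows;

    private ScanPlan(String description, Supplier<List<String>> rows) {
        this.description = description;
        this.rows = rows;
    }

    /** Eager: the gets run at plan-build time and the result is held by the plan. */
    static ScanPlan eager(FakeTable table, List<String> keys) {
        List<String> fetched = table.get(keys); // retained even if never executed
        return new ScanPlan("eager get of " + keys.size() + " keys", () -> fetched);
    }

    /** Deferred: nothing is fetched until execute() is called. */
    static ScanPlan deferred(FakeTable table, List<String> keys) {
        return new ScanPlan("deferred get of " + keys.size() + " keys",
                () -> table.get(keys));
    }

    String explain() {
        return description;
    }

    List<String> execute() {
        return rows.get();
    }
}
```

With this sketch, building an eager plan and calling only explain leaves
getCalls at 1, while the same sequence on a deferred plan leaves it at 0.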
[~ted.m] Please let me know what you think, and correct me if my
understanding is wrong.
> Enhance the Gets in the connector
> ---------------------------------
>
> Key: HBASE-14796
> URL: https://issues.apache.org/jira/browse/HBASE-14796
> Project: HBase
> Issue Type: Improvement
> Reporter: Ted Malaska
> Assignee: Zhan Zhang
> Priority: Minor
> Attachments: HBASE-14976.patch
>
>
> Currently the Spark-Module Spark SQL implementation gets records from HBase
> from the driver if there is something like the following found in the SQL.
> rowkey = 123
> The original reason for this was that normal SQL will not have many equality
> operations in a single where clause.
> Zhan had brought up two points that have value.
> 1. The SQL may be generated and may have many, many equality statements in
> it, so moving the work to an executor protects the driver from load
> 2. In the current implementation the driver is connecting to HBase, and
> exceptions may cause trouble with the Spark application and not just with
> a single task execution