[ 
https://issues.apache.org/jira/browse/IMPALA-10337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17244392#comment-17244392
 ] 

ASF subversion and git services commented on IMPALA-10337:
----------------------------------------------------------

Commit 2004a87edfa3a78e89623f395e8697047fe3c984 in impala's branch 
refs/heads/master from Riza Suminto
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=2004a87 ]

IMPALA-10337: Consider MAX_ROW_SIZE when computing max reservation

PlanRootSink can fail silently if result spooling is enabled and
maxMemReservationBytes is less than 2 * MAX_ROW_SIZE. This happens
because results are spilled using a SpillableRowBatchQueue, which needs
two buffers (read and write) of at least MAX_ROW_SIZE bytes each. This
patch fixes the problem by setting a lower bound of 2 * MAX_ROW_SIZE
when computing the min reservation for the PlanRootSink.
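
The lower bound described above can be sketched roughly as follows. This
is an illustrative Python sketch, not the actual planner code: the
function name compute_sink_reservation and its parameters are
hypothetical, and it ignores any buffer-size rounding a real planner may
apply.

```python
from typing import Tuple

MB = 1024 * 1024


def compute_sink_reservation(max_result_spooling_mem: int,
                             max_row_size: int) -> Tuple[int, int]:
    """Return (min_reservation, max_reservation) in bytes.

    Hypothetical helper: the spillable queue is assumed to need one read
    buffer and one write buffer, each able to hold a maximum-size row.
    """
    if max_row_size <= 0:
        raise ValueError("max_row_size must be positive")
    # Two buffers (read and write), each at least max_row_size bytes.
    min_reservation = 2 * max_row_size
    # The max reservation is never allowed to drop below the minimum.
    max_reservation = max(max_result_spooling_mem, min_reservation)
    return min_reservation, max_reservation
```

With these assumptions, a 100 MB spooling limit combined with a 257 MB
max row size would yield a reservation of 514 MB for both bounds, while a
1024 MB spooling limit would leave the max reservation at 1024 MB.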

Testing:
- Pass exhaustive tests.
- Add e2e TestResultSpoolingMaxReservation.
- Lower MAX_ROW_SIZE on tests where MAX_RESULT_SPOOLING_MEM is set to
  an extremely low value. Also verify that PLAN_ROOT_SINK's
  ReservationLimit remains unchanged after lowering MAX_ROW_SIZE.

Change-Id: Id7138e1e034ea5d1cd15cf8de399690e52a9d726
Reviewed-on: http://gerrit.cloudera.org:8080/16765
Reviewed-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>


> DCHECK hit at SpillableRowBatchQueue when row size exceeds max reservation
> --------------------------------------------------------------------------
>
>                 Key: IMPALA-10337
>                 URL: https://issues.apache.org/jira/browse/IMPALA-10337
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Backend
>    Affects Versions: Impala 3.4.0
>            Reporter: Riza Suminto
>            Assignee: Riza Suminto
>            Priority: Major
>
> While working on IMPALA-9856, I found that the following DCHECK in
> SpillableRowBatchQueue::AddBatch is consistently hit when result spooling
> is enabled and the row size is larger than
> resource_profile_.max_reservation, causing impalad to crash.
>  
> [https://github.com/apache/impala/blob/eea617b/be/src/runtime/spillable-row-batch-queue.cc#L97]
> We can reproduce this issue by setting the following query options in
> query_test/test_insert.py::TestInsertQueries::test_insert_large_string:
> {code:python}
>     self.client.set_configuration_option("spool_query_results", "1")
>     self.client.set_configuration_option("max_row_size", "257mb")
> {code}
> Additionally, setting max_result_spooling_mem to 512MB increases
> resource_profile_.max_reservation enough to fit the large row and avoids
> this DCHECK.
> Instead of hitting a DCHECK, I think impalad should return an error
> status suggesting that the user set a larger max_result_spooling_mem.
> Another solution is to also consider max_row_size when computing
> maxMemReservationBytes in PlanRootSink.java.
>  
> [https://github.com/apache/impala/blob/eea617b/fe/src/main/java/org/apache/impala/planner/PlanRootSink.java#L74]
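
The first suggestion in the description (return an error status instead
of hitting a DCHECK) could look roughly like the sketch below. It is
written in Python purely for illustration; the Status class and the
check_row_fits function are hypothetical and do not correspond to
Impala's backend API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Status:
    """Hypothetical stand-in for an error-status return value."""
    ok: bool
    message: str = ""


def check_row_fits(row_size: int, max_reservation: int) -> Status:
    """Report an error instead of asserting when a row cannot fit."""
    if row_size > max_reservation:
        return Status(
            False,
            f"Row of {row_size} bytes exceeds the result spooling "
            f"reservation of {max_reservation} bytes; consider setting a "
            f"larger max_result_spooling_mem.",
        )
    return Status(True)
```

A caller would check the returned status before adding the batch and
surface the message to the user, rather than crashing the process.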


