Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13178 )

Change subject: IMPALA-4658: Potential race if compiler reorders ReachedLimit() 
usage.
......................................................................


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/13178/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/13178/4//COMMIT_MSG@9
PS4, Line 9: The ExecNode::num_rows_returned_ member is shared by the scan node
           : (thread) and the related scanner threads. There is a potential 
race here
           : since the compiler is free to reorder the load/store of
           : num_rows_returned_ member. Also, num_rows_returned_ is not 
accessed in
           : a thread safe manner.
> Can this actually cause problems? I think that all scanners read num_rows_r
It's hard to reason about the implications. It would be valid  for the load 
instruction to get pulled out of a loop and potentially have a non-terminating 
loop. I don't think we have code where this would happen now, but who knows - 
the point is more that the pattern is unsafe and you can't reason about the 
behaviour of the functional locally.

It also is going to generate warnings in TSAN, which we'd prefer to avoid.


http://gerrit.cloudera.org:8080/#/c/13178/4/be/src/exec/hdfs-scan-node.cc
File be/src/exec/hdfs-scan-node.cc:

http://gerrit.cloudera.org:8080/#/c/13178/4/be/src/exec/hdfs-scan-node.cc@134
PS4, Line 134:     // Update the number of materialized rows now instead of 
when they are materialized.
             :     // This means that scanners might process and queue up more 
rows than are necessary
             :     // for the limit case but we want to avoid the synchronized 
writes to
             :     // num_rows_returned_.
             :     IncrementNumRowsReturnedShared(row_batch->num_rows());
> The comment + the Shared function seem weird together - the comment says th
Yeah the comment is a little weird now in that it's justifying using the 
standard pattern for num_rows_returned_ instead of an alternative design. The 
alternative design would actually probably violate invariants of ExecNode by 
incrementing num_rows_returned_ before returning the actual nodes. So it 
doesn't make a lot of sense in context.

We could just delete the comment I think. Or something short describing the 
consequences like

  // Note that the scanner threads may have processed and queued up extra rows 
before this thread incremented the rows returned.



--
To view, visit http://gerrit.cloudera.org:8080/13178
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4cbbfad80f7ab87dd6f192a24e2c68f7c66b047e
Gerrit-Change-Number: 13178
Gerrit-PatchSet: 4
Gerrit-Owner: Abhishek Rawat <ara...@cloudera.com>
Gerrit-Reviewer: Abhishek Rawat <ara...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
Gerrit-Comment-Date: Thu, 02 May 2019 20:27:39 +0000
Gerrit-HasComments: Yes

Reply via email to