Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17054 )

Change subject: IMPALA-10497: Fix flakiness in 
test_no_fd_caching_on_cached_data.
......................................................................


Patch Set 1:

(5 comments)

Thanks for looking into this.

My main thought here is that I think the change may be bigger than it needs to 
be. My theory is that the "Repeat the warm up query 5 times" is the part of the 
change that makes it work and it may not need the other parts. I would be 
interested in whether a stripped down version of this with only that part of 
the change would fix the issue.

http://gerrit.cloudera.org:8080/#/c/17054/1/tests/custom_cluster/test_hdfs_fd_caching.py
File tests/custom_cluster/test_hdfs_fd_caching.py:

http://gerrit.cloudera.org:8080/#/c/17054/1/tests/custom_cluster/test_hdfs_fd_caching.py@179
PS1, Line 179: TestHdfsFdCaching
One thing to watch out for here is if you inherit from TestHdfsFdCaching, you 
may inherit all of its test cases (e.g. test_caching_with_eviction()). I'm not 
sure how pytest deals with that.


http://gerrit.cloudera.org:8080/#/c/17054/1/tests/custom_cluster/test_hdfs_fd_caching.py@184
PS1, Line 184:   NUM_ROWS = 10
I believe that the number of rows should not impact the behavior. The 
differences between 10 and 100 is small enough that it shouldn't impact the IO 
pattern. They are all in a single parquet files, and we would do an IO for the 
footer and one IO for each column.


http://gerrit.cloudera.org:8080/#/c/17054/1/tests/custom_cluster/test_hdfs_fd_caching.py@190
PS1, Line 190: cluster_size=1
I think this should not impact the caching. Our scheduling algorithm maps files 
to nodes consistently, so each node should be reading the same file each time.


http://gerrit.cloudera.org:8080/#/c/17054/1/tests/custom_cluster/test_hdfs_fd_caching.py@207
PS1, Line 207: Read 5 times to make
             :     # sure the data cache is fully warmed up.
             :     for x in range(5):
This makes a lot of sense to me as the explanation. There are times when data 
won't be written to the cache. In particular, there is a limit on concurrency 
in writing to the cache. I'm guessing we could hit that and fail to cache 
something the first time around.


http://gerrit.cloudera.org:8080/#/c/17054/1/tests/custom_cluster/test_hdfs_fd_caching.py@224
PS1, Line 224: for x in range(5):
I'm fine with either value here, but I also don't see this changing the 
behavior of the test.



--
To view, visit http://gerrit.cloudera.org:8080/17054
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I774f9dfea7dcc107c3c7f2b76db3aaf4b2dd7952
Gerrit-Change-Number: 17054
Gerrit-PatchSet: 1
Gerrit-Owner: Riza Suminto <riza.sumi...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bikramjeet....@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <joemcdonn...@cloudera.com>
Gerrit-Reviewer: Riza Suminto <riza.sumi...@cloudera.com>
Gerrit-Comment-Date: Wed, 10 Feb 2021 22:59:41 +0000
Gerrit-HasComments: Yes

Reply via email to