Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/13724 )
Change subject: IMPALA-8341: [DOCS] Describe the settings for remote data caching
......................................................................

Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13724/1/docs/topics/impala_data_cache.xml
File docs/topics/impala_data_cache.xml:

http://gerrit.cloudera.org:8080/#/c/13724/1/docs/topics/impala_data_cache.xml@44
PS1, Line 44: --data_cache_dir
--data_cache_dir and --data_cache_size are options built specifically for ./bin/start-impala-cluster.py, as that script needs to create extra sub-directories for the caching directory.

To specify the caching directory, the user should use the flag:

--data_cache=<dir1>,<dir2>,<dir3>:<quota>

With the above configuration, data will be stored in <dir1>, <dir2> and <dir3>. The user must make sure those directories already exist on the local filesystem. In addition, the filesystem on which each directory resides must support hole punching. Modern filesystems such as ext4 and xfs support this feature.

The cache may consume up to <quota> bytes for each of the directories specified. In other words, with the above configuration, the total cache size can be up to 3 * <quota>.

Please see https://github.com/apache/impala/blob/master/be/src/runtime/io/disk-io-mgr.cc#L58-L63 for the definition of the flag.

--
To view, visit http://gerrit.cloudera.org:8080/13724
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7dd958e4de109b46eaf906fe93145799af123b3f
Gerrit-Change-Number: 13724
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Rodoni <arod...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Michael Ho <k...@cloudera.com>
Gerrit-Comment-Date: Tue, 25 Jun 2019 17:50:09 +0000
Gerrit-HasComments: Yes
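
For illustration only, the per-directory quota arithmetic described in the comment above can be sketched in Python. This is not Impala's own flag parser: the function names parse_data_cache and max_total_cache_bytes are hypothetical, the directory paths are made up, and the sketch assumes <quota> is given as a plain integer number of bytes.

```python
def parse_data_cache(value):
    """Split a '<dir1>,<dir2>,...:<quota>' string into (dirs, quota_bytes).

    Hypothetical helper: assumes the quota is a plain integer byte count.
    """
    # Split on the last colon so the quota is whatever follows it.
    dirs_part, sep, quota_part = value.rpartition(":")
    if not sep or not dirs_part:
        raise ValueError("expected '<dir1>,<dir2>,...:<quota>'")
    dirs = [d for d in dirs_part.split(",") if d]
    if not dirs:
        raise ValueError("at least one cache directory is required")
    quota = int(quota_part)
    if quota <= 0:
        raise ValueError("quota must be a positive number of bytes")
    return dirs, quota


def max_total_cache_bytes(value):
    """Upper bound on total cache size: the quota applies to each directory."""
    dirs, quota = parse_data_cache(value)
    return len(dirs) * quota


if __name__ == "__main__":
    # Three made-up directories, each allowed up to the same quota.
    example = "/data/0,/data/1,/data/2:1000"
    print(parse_data_cache(example))
    print(max_total_cache_bytes(example))
```

With three directories, the upper bound is three times the quota, which matches the 3 * <quota> figure in the comment.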