[Impala-ASF-CR] IMPALA-4623: Enable file handle cache
Impala Public Jenkins has submitted this change and it was merged.

Change subject: IMPALA-4623: Enable file handle cache
......................................................................

IMPALA-4623: Enable file handle cache

Currently, every scan range maintains its own file handle, even when
multiple scan ranges are accessing the same file. Opening all of these
file handles puts load on the NameNode, which can lead to scaling
issues. There are two parts to this change:
1. Enable file handle caching by default for local files
2. Share the file handle between scan ranges reading the same file

Local scan ranges no longer maintain their own HDFS file handles. On
each read, the I/O thread gets the HDFS file handle from the cache
(opening it if necessary) and uses it for the read. This allows
multiple scan ranges on the same file to use the same file handle.
Since the file offset is no longer consistent for an individual scan
range, every HDFS read must either use hdfsPread or seek before
reading. Additionally, since HDFS read statistics are maintained on
the file handle, the read statistics must be retrieved and cleared
after each read.

To manage contention, the file handle cache is now partitioned by a
hash of the key into independent caches with independent locks. The
allowed capacity of the file handle cache is split evenly among the
partitions, and file handles are evicted independently in each
partition. The cache maintains ownership of the file handles at all
times, but it will not evict a file handle that is in use.

If max_cached_file_handles is set to 0, or the scan range is accessing
data cached by HDFS, or the scan range is remote, the scan range gets
a file handle from the cache and holds it until the scan range is
closed. This mimics the existing behavior, except that the file handle
stays in the cache and is owned by the cache; since it is in use, it
will not be evicted.

If a file handle in the cache becomes invalid, Read() calls may fail.
Consequently, if Read() encounters an error using a file handle from
the cache, it destroys the handle and retries once with a new file
handle. Any subsequent error is unrelated to the file handle cache and
is returned to the caller.

Tests:
query_test/test_hdfs_fd_caching.py copies the files from an existing
table into a new directory and uses that directory to create an
external table. It queries the external table, then uses the hdfs
command line to manipulate the underlying file (delete, move, etc.).
It queries again to make sure Impala does not crash. Then it runs
"invalidate metadata". It checks the row counts before the
modification and after "invalidate metadata", but it does not check
the results in between.

custom_cluster/test_hdfs_fd_caching.py starts a cluster with a small
file handle cache size. It verifies that a file handle can be reused
(i.e., rerunning a query does not result in more file handles being
cached). It also verifies that the cache capacity is enforced.

Change-Id: Ibe5ff60971dd653c3b6a0e13928cfa9fc59d078d
Reviewed-on: http://gerrit.cloudera.org:8080/6478
Reviewed-by: Dan Hecht
Tested-by: Impala Public Jenkins
---
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/runtime/buffered-block-mgr.h
A be/src/runtime/disk-io-mgr-handle-cache.h
A be/src/runtime/disk-io-mgr-handle-cache.inline.h
M be/src/runtime/disk-io-mgr-internal.h
M be/src/runtime/disk-io-mgr-reader-context.cc
M be/src/runtime/disk-io-mgr-scan-range.cc
M be/src/runtime/disk-io-mgr.cc
M be/src/runtime/disk-io-mgr.h
M tests/custom_cluster/test_hdfs_fd_caching.py
M tests/metadata/test_refresh_partition.py
M tests/query_test/test_hdfs_fd_caching.py
13 files changed, 866 insertions(+), 419 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Dan Hecht: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/6478
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Ibe5ff60971dd653c3b6a0e13928cfa9fc59d078d
Gerrit-PatchSet: 24
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell
Gerrit-Reviewer: Dan Hecht
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple
Gerrit-Reviewer: Joe McDonnell
Gerrit-Reviewer: Marcel Kornacker
Gerrit-Reviewer: Tim Armstrong
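The partitioning scheme described in the commit message can be sketched as follows. This is a minimal illustration of the idea (hash the file key to pick a partition, each partition with its own lock and eviction, in-use handles never evicted), not Impala's actual `FileHandleCache` from disk-io-mgr-handle-cache.h; all class and member names here are invented for the sketch.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <memory>
#include <mutex>
#include <string>
#include <vector>

// Illustrative stand-in for an HDFS file handle.
struct FileHandle {
  std::string path;
  bool in_use = false;
};

// One independently locked partition. Its capacity is the total cache
// capacity divided evenly among the partitions.
class HandleCachePartition {
 public:
  explicit HandleCachePartition(size_t capacity) : capacity_(capacity) {}

  // Returns a handle for 'path', opening one if none is cached, and
  // marks it in use so it cannot be evicted while a read is in flight.
  FileHandle* GetHandle(const std::string& path) {
    std::lock_guard<std::mutex> l(lock_);
    auto it = handles_.find(path);
    if (it == handles_.end()) {
      EvictIfNeeded();
      it = handles_.emplace(path, FileHandle{path, false}).first;
    }
    it->second.in_use = true;
    return &it->second;
  }

  void ReleaseHandle(FileHandle* fh) {
    std::lock_guard<std::mutex> l(lock_);
    fh->in_use = false;  // now eligible for eviction
  }

 private:
  // Evicts only handles that are not in use; if everything is in use,
  // the partition temporarily exceeds its capacity.
  void EvictIfNeeded() {
    while (handles_.size() >= capacity_) {
      bool evicted = false;
      for (auto it = handles_.begin(); it != handles_.end(); ++it) {
        if (!it->second.in_use) {
          handles_.erase(it);
          evicted = true;
          break;
        }
      }
      if (!evicted) break;
    }
  }

  std::mutex lock_;
  size_t capacity_;
  std::map<std::string, FileHandle> handles_;
};

// The cache hashes the key to choose a partition, so scan ranges on
// unrelated files contend on different locks.
class FileHandleCache {
 public:
  FileHandleCache(size_t num_partitions, size_t total_capacity) {
    for (size_t i = 0; i < num_partitions; ++i) {
      partitions_.push_back(std::make_unique<HandleCachePartition>(
          total_capacity / num_partitions));
    }
  }
  FileHandle* GetHandle(const std::string& path) {
    return Partition(path)->GetHandle(path);
  }
  void ReleaseHandle(const std::string& path, FileHandle* fh) {
    Partition(path)->ReleaseHandle(fh);
  }

 private:
  HandleCachePartition* Partition(const std::string& path) {
    return partitions_[std::hash<std::string>{}(path) % partitions_.size()].get();
  }
  std::vector<std::unique_ptr<HandleCachePartition>> partitions_;
};
```

A second lookup for the same path returns the same cached handle, which is the sharing behavior the change introduces for local scan ranges.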
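The retry behavior in the commit message (destroy a possibly stale cached handle and retry exactly once with a fresh one; only a second failure propagates) can be sketched like this. `ReadWithRetry` and its parameters are hypothetical names for illustration, not the actual `ScanRange::Read()` code.

```cpp
#include <functional>
#include <stdexcept>
#include <string>

// Illustrative: a read callback that may throw if the handle is stale.
using ReadFn = std::function<int(const std::string& handle)>;

// Retry-once policy: the first failure is blamed on a possibly stale
// cached file handle, so the handle is destroyed and reopened. Any
// failure on the retry is assumed unrelated to the cache and is
// propagated to the caller unchanged.
int ReadWithRetry(const ReadFn& read, const std::string& cached_handle,
                  const std::function<std::string()>& reopen) {
  try {
    return read(cached_handle);
  } catch (const std::exception&) {
    std::string fresh_handle = reopen();  // destroy + reopen the handle
    return read(fresh_handle);            // errors here propagate
  }
}
```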
[Impala-ASF-CR] IMPALA-4623: Enable file handle cache
Impala Public Jenkins has posted comments on this change.

Patch Set 23: Verified+1
[Impala-ASF-CR] IMPALA-4623: Enable file handle cache
Impala Public Jenkins has posted comments on this change.

Patch Set 23: Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/654/
[Impala-ASF-CR] IMPALA-4623: Enable file handle cache
Dan Hecht has posted comments on this change.

Patch Set 23: Code-Review+2

(1 comment)

File be/src/runtime/disk-io-mgr.h
(http://gerrit.cloudera.org:8080/#/c/6478/23/be/src/runtime/disk-io-mgr.h):

Line 201: // for its partitions.
Technically, the alignment declaration isn't needed here (a struct's
alignment equals the maximum alignment of its fields), but the
overloaded new operator is needed. The comment is fine, though.
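Dan's point about needing the overloaded new operator comes from the fact that, before C++17 aligned allocation, plain `operator new` only guarantees `alignof(std::max_align_t)` (typically 16 bytes), so a cache-line-aligned type allocated on the heap can land misaligned unless it supplies its own allocator. A minimal sketch of the idea, with invented names (this is not Impala's actual code):

```cpp
#include <cstdint>
#include <cstdlib>
#include <new>

constexpr std::size_t CACHE_LINE_SIZE = 64;

// A struct over-aligned to a cache line to avoid false sharing.
// Pre-C++17, the default operator new may return storage aligned to
// only alignof(std::max_align_t), so we must overload operator new to
// honor the 64-byte requirement.
struct alignas(CACHE_LINE_SIZE) Partition {
  std::uint8_t data[CACHE_LINE_SIZE];

  static void* operator new(std::size_t size) {
    void* p = nullptr;
    // posix_memalign guarantees the requested alignment (POSIX).
    if (posix_memalign(&p, CACHE_LINE_SIZE, size) != 0) throw std::bad_alloc();
    return p;
  }
  static void operator delete(void* p) { free(p); }
};
```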
[Impala-ASF-CR] IMPALA-4623: Enable file handle cache
Hello Marcel Kornacker, Impala Public Jenkins, Dan Hecht,

I'd like you to reexamine a change. Please visit
http://gerrit.cloudera.org:8080/6478 to look at the new patch set (#23).

Change subject: IMPALA-4623: Enable file handle cache

13 files changed, 866 insertions(+), 419 deletions(-)

git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/6478/23
[Impala-ASF-CR] IMPALA-4623: Enable file handle cache
Joe McDonnell has posted comments on this change.

Patch Set 22: (2 comments)

File be/src/runtime/disk-io-mgr-handle-cache.h
(http://gerrit.cloudera.org:8080/#/c/6478/22/be/src/runtime/disk-io-mgr-handle-cache.h):

Line 155: uint8_t padding[CACHE_LINE_SIZE];
> What happened? The clang-tidy complaints seem to be about other types; does
I misunderstood and thought that changing the alignas(64) on this
struct to a CacheLineAligned would eliminate the problem, but aligning
it either way requires DiskIoMgr to be aligned as well.

File be/src/runtime/disk-io-mgr.h
(http://gerrit.cloudera.org:8080/#/c/6478/22/be/src/runtime/disk-io-mgr.h):

Line 199: class DiskIoMgr {
> This is apparently a type that should be cache-aligned.
This cache alignment is required because FileHandleCachePartition is
cache aligned. To pass clang-tidy, cache alignment should apply to
both or to neither. This upload does neither; I'm switching it to both.
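Joe's "both or neither" observation follows from how alignment propagates in C++: a class containing an over-aligned member becomes over-aligned itself, so any aligned-allocation requirement on the member type applies equally to the container. A small sketch under that assumption (`DiskIoMgrLike` is a hypothetical stand-in, not the real DiskIoMgr):

```cpp
#include <cstddef>
#include <cstdint>

constexpr std::size_t CACHE_LINE_SIZE = 64;

// A cache-line-aligned partition, as in disk-io-mgr-handle-cache.h.
struct alignas(CACHE_LINE_SIZE) FileHandleCachePartition {
  std::uint8_t data[CACHE_LINE_SIZE];
};

// The member's alignment propagates: this container is forced to
// 64-byte alignment because it holds a partition, so clang-tidy's
// aligned-allocation check applies to both classes or to neither.
struct DiskIoMgrLike {
  int other_state;
  FileHandleCachePartition partition;
};

static_assert(alignof(DiskIoMgrLike) == CACHE_LINE_SIZE,
              "a member's over-alignment propagates to its container");
```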
[Impala-ASF-CR] IMPALA-4623: Enable file handle cache
Impala Public Jenkins has posted comments on this change.

Patch Set 21: Verified-1

Build failed: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/653/
[Impala-ASF-CR] IMPALA-4623: Enable file handle cache
Jim Apple has posted comments on this change.

Patch Set 22: (2 comments)

File be/src/runtime/disk-io-mgr-handle-cache.h
(http://gerrit.cloudera.org:8080/#/c/6478/22/be/src/runtime/disk-io-mgr-handle-cache.h):

Line 155: uint8_t padding[CACHE_LINE_SIZE];
What happened? The clang-tidy complaints seem to be about other types;
doesn't the alignment trick fix it for this one?

File be/src/runtime/disk-io-mgr.h
(http://gerrit.cloudera.org:8080/#/c/6478/22/be/src/runtime/disk-io-mgr.h):

Line 199: class DiskIoMgr {
This is apparently a type that should be cache-aligned.
[Impala-ASF-CR] IMPALA-4623: Enable file handle cache
Hello Marcel Kornacker, Impala Public Jenkins, Dan Hecht,

I'd like you to reexamine a change. Please visit
http://gerrit.cloudera.org:8080/6478 to look at the new patch set (#22).

Change subject: IMPALA-4623: Enable file handle cache

13 files changed, 866 insertions(+), 418 deletions(-)

git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/6478/22
[Impala-ASF-CR] IMPALA-4623: Enable file handle cache
Impala Public Jenkins has posted comments on this change.

Patch Set 21: Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/653/
[Impala-ASF-CR] IMPALA-4623: Enable file handle cache
Dan Hecht has posted comments on this change.

Patch Set 21: Code-Review+2

Carry +2.
[Impala-ASF-CR] IMPALA-4623: Enable file handle cache
Joe McDonnell has posted comments on this change.

Patch Set 20: (1 comment)

File be/src/runtime/disk-io-mgr-handle-cache.h
(http://gerrit.cloudera.org:8080/#/c/6478/20/be/src/runtime/disk-io-mgr-handle-cache.h):

Line 155: uint8_t padding[64];
> You can do this without padding by publicly inheriting from CacheLineAligne
Thanks, that is just what I needed.
[Impala-ASF-CR] IMPALA-4623: Enable file handle cache
Hello Marcel Kornacker, Impala Public Jenkins, Dan Hecht,

I'd like you to reexamine a change. Please visit
http://gerrit.cloudera.org:8080/6478 to look at the new patch set (#21).

Change subject: IMPALA-4623: Enable file handle cache

13 files changed, 862 insertions(+), 418 deletions(-)

git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/6478/21
[Impala-ASF-CR] IMPALA-4623: Enable file handle cache
Jim Apple has posted comments on this change.

Patch Set 20: (1 comment)

File be/src/runtime/disk-io-mgr-handle-cache.h
(http://gerrit.cloudera.org:8080/#/c/6478/20/be/src/runtime/disk-io-mgr-handle-cache.h):

Line 155: uint8_t padding[64];
You can do this without padding by publicly inheriting from
CacheLineAligned from util/aligned-new.h.
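The CacheLineAligned pattern Jim refers to can be sketched as a base class that both over-aligns the derived type and supplies an aligned allocator, replacing the manual padding member. This is a hypothetical reconstruction of the idea, not the actual contents of be/src/util/aligned-new.h:

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <new>

constexpr std::size_t CACHE_LINE = 64;

// Hypothetical sketch of a CacheLineAligned base: deriving from it
// forces the derived class to cache-line alignment and gives it an
// operator new that actually honors that alignment on the heap.
struct alignas(CACHE_LINE) CacheLineAlignedSketch {
  static void* operator new(std::size_t size) {
    void* p = nullptr;
    if (posix_memalign(&p, CACHE_LINE, size) != 0) throw std::bad_alloc();
    return p;
  }
  static void operator delete(void* p) { free(p); }
};

// Public inheritance replaces the manual 'uint8_t padding[64]' member:
// the derived type inherits both the alignment and the allocator.
struct Partition : public CacheLineAlignedSketch {
  int counter = 0;
};
```

The design advantage over a padding array is that the compiler enforces the alignment (via `alignas`) rather than relying on member layout, and heap allocations go through the aligned `operator new` automatically.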
[Impala-ASF-CR] IMPALA-4623: Enable file handle cache
Hello Marcel Kornacker, Impala Public Jenkins, Dan Hecht,

I'd like you to reexamine a change. Please visit
http://gerrit.cloudera.org:8080/6478 to look at the new patch set (#20).

Change subject: IMPALA-4623: Enable file handle cache

13 files changed, 866 insertions(+), 418 deletions(-)

git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/6478/20
[Impala-ASF-CR] IMPALA-4623: Enable file handle cache
Impala Public Jenkins has posted comments on this change.

Patch Set 19: Verified-1

Build failed: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/648/
[Impala-ASF-CR] IMPALA-4623: Enable file handle cache
Impala Public Jenkins has posted comments on this change.

Patch Set 19: Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/648/
[Impala-ASF-CR] IMPALA-4623: Enable file handle cache
Dan Hecht has posted comments on this change. Change subject: IMPALA-4623: Enable file handle cache .. Patch Set 19: Code-Review+2 Carry +2. -- To view, visit http://gerrit.cloudera.org:8080/6478 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ibe5ff60971dd653c3b6a0e13928cfa9fc59d078d Gerrit-PatchSet: 19 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4623: Enable file handle cache
Hello Marcel Kornacker, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/6478 to look at the new patch set (#19). Change subject: IMPALA-4623: Enable file handle cache .. IMPALA-4623: Enable file handle cache Currently, every scan range maintains a file handle, even when multiple scan ranges are accessing the same file. Opening the file handles causes load on the NameNode, which can lead to scaling issues. There are two parts to this change: 1. Enable file handle caching by default for local files 2. Share the file handle between scan ranges from the same file Local scan ranges no longer maintain their own Hdfs file handles. On each read, the I/O thread will get the Hdfs file handle from the cache (opening it if necessary) and use that for the read. This allows multiple scan ranges on the same file to use the same file handle. Since the file offsets are no longer consistent for an individual scan range, all Hdfs reads need to either use hdfsPread or do a seek before reading. Additionally, since Hdfs read statistics are maintained on the file handle, the read statistics must be retrieved and cleared after each read. To manage contention, the file handle cache is now partitioned by a hash of the key into independent caches with independent locks. The allowed capacity of the file handle cache is split evenly among the partitions. File handles are evicted independently for each partition. The file handle cache maintains ownership of the file handles at all times, but it will not evict a file handle that is in use. If max_cached_file_handles is set to 0, or the scan range is accessing data cached by Hdfs, or the scan range is remote, the scan range will get a file handle from the cache and hold it until the scan range is closed. This mimics the existing behavior, except the file handle stays in the cache and is owned by the cache. Since it is in use, it will not be evicted.
If a file handle in the cache becomes invalid, it may result in Read() calls failing. Consequently, if Read() encounters an error using a file handle from the cache, it will destroy the handle and retry once with a new file handle. Any subsequent error is unrelated to the file handle cache and will be returned. Tests: query_test/test_hdfs_fd_caching.py copies the files from an existing table into a new directory and uses that to create an external table. It queries the external table, then uses the hdfs command line to manipulate the hdfs file (delete, move, etc.). It queries again to make sure we don't crash. Then, it runs "invalidate metadata". It checks the row counts before the modification and after "invalidate metadata", but it does not check the results in between. custom_cluster/test_hdfs_fd_caching.py starts up a cluster with a small file handle cache size. It verifies that a file handle can be reused (i.e., rerunning a query does not result in more file handles cached). It also verifies that the cache capacity is enforced.
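The partitioning scheme the commit message describes — independent caches with independent locks, total capacity split evenly among partitions, per-partition eviction that never touches a handle in use — can be sketched roughly as follows. This is an illustrative Python sketch, not Impala's C++ implementation; `PartitionedHandleCache` and its method names are hypothetical.

```python
import threading
from collections import OrderedDict

class PartitionedHandleCache:
    """Hypothetical sketch of a cache partitioned by a hash of the key."""

    def __init__(self, capacity, num_partitions=4):
        # The allowed capacity is split evenly among the partitions, each of
        # which has its own lock and its own LRU ordering.
        self.partitions = [
            {"lock": threading.Lock(),
             "entries": OrderedDict(),   # key -> (handle, in_use)
             "capacity": capacity // num_partitions}
            for _ in range(num_partitions)
        ]

    def _partition(self, key):
        return self.partitions[hash(key) % len(self.partitions)]

    def get(self, key, open_fn):
        """Return a cached handle for key, opening one via open_fn on a miss."""
        p = self._partition(key)
        with p["lock"]:
            if key in p["entries"]:
                handle, _ = p["entries"].pop(key)   # cache hit; no open needed
            else:
                handle = open_fn(key)               # miss: open a new handle
            p["entries"][key] = (handle, True)      # in use: exempt from eviction
            self._evict(p)
            return handle

    def release(self, key):
        """Mark the handle idle; it stays cached and owned by the cache."""
        p = self._partition(key)
        with p["lock"]:
            handle, _ = p["entries"][key]
            p["entries"][key] = (handle, False)

    def _evict(self, p):
        # Evict least-recently-inserted idle handles, independently of every
        # other partition; handles that are in use are never evicted.
        idle = [k for k, (_, in_use) in p["entries"].items() if not in_use]
        while len(p["entries"]) > p["capacity"] and idle:
            p["entries"].pop(idle.pop(0))

# Small demo: a released handle is reused instead of reopened.
cache = PartitionedHandleCache(capacity=4, num_partitions=1)
opens = []
def opener(key):
    opens.append(key)
    return "handle:" + key

h = cache.get("f1", opener)
cache.release("f1")
h2 = cache.get("f1", opener)
assert h2 == h and opens == ["f1"]   # second get was a hit; opener ran once
```

Because a handle in use is exempt from eviction, a partition can temporarily exceed its capacity share when every cached handle is borrowed, matching the "will not evict a file handle that is in use" behavior above.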
Change-Id: Ibe5ff60971dd653c3b6a0e13928cfa9fc59d078d --- M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/runtime/buffered-block-mgr.h A be/src/runtime/disk-io-mgr-handle-cache.h A be/src/runtime/disk-io-mgr-handle-cache.inline.h M be/src/runtime/disk-io-mgr-internal.h M be/src/runtime/disk-io-mgr-reader-context.cc M be/src/runtime/disk-io-mgr-scan-range.cc M be/src/runtime/disk-io-mgr.cc M be/src/runtime/disk-io-mgr.h M tests/custom_cluster/test_hdfs_fd_caching.py M tests/metadata/test_refresh_partition.py M tests/query_test/test_hdfs_fd_caching.py 13 files changed, 861 insertions(+), 418 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/6478/19 -- To view, visit http://gerrit.cloudera.org:8080/6478 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ibe5ff60971dd653c3b6a0e13928cfa9fc59d078d Gerrit-PatchSet: 19 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4623: Enable file handle cache
Hello Marcel Kornacker, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/6478 to look at the new patch set (#18). Change subject: IMPALA-4623: Enable file handle cache .. IMPALA-4623: Enable file handle cache Currently, every scan range maintains a file handle, even when multiple scan ranges are accessing the same file. Opening the file handles causes load on the NameNode, which can lead to scaling issues. There are two parts to this change: 1. Enable file handle caching by default for local files 2. Share the file handle between scan ranges from the same file Local scan ranges no longer maintain their own Hdfs file handles. On each read, the I/O thread will get the Hdfs file handle from the cache (opening it if necessary) and use that for the read. This allows multiple scan ranges on the same file to use the same file handle. Since the file offsets are no longer consistent for an individual scan range, all Hdfs reads need to either use hdfsPread or do a seek before reading. Additionally, since Hdfs read statistics are maintained on the file handle, the read statistics must be retrieved and cleared after each read. To manage contention, the file handle cache is now partitioned by a hash of the key into independent caches with independent locks. The allowed capacity of the file handle cache is split evenly among the partitions. File handles are evicted independently for each partition. The file handle cache maintains ownership of the file handles at all times, but it will not evict a file handle that is in use. If max_cached_file_handles is set to 0, or the scan range is accessing data cached by Hdfs, or the scan range is remote, the scan range will get a file handle from the cache and hold it until the scan range is closed. This mimics the existing behavior, except the file handle stays in the cache and is owned by the cache. Since it is in use, it will not be evicted.
If a file handle in the cache becomes invalid, it may result in Read() calls failing. Consequently, if Read() encounters an error using a file handle from the cache, it will destroy the handle and retry once with a new file handle. Any subsequent error is unrelated to the file handle cache and will be returned. Tests: query_test/test_hdfs_fd_caching.py copies the files from an existing table into a new directory and uses that to create an external table. It queries the external table, then uses the hdfs command line to manipulate the hdfs file (delete, move, etc.). It queries again to make sure we don't crash. Then, it runs "invalidate metadata". It checks the row counts before the modification and after "invalidate metadata", but it does not check the results in between. custom_cluster/test_hdfs_fd_caching.py starts up a cluster with a small file handle cache size. It verifies that a file handle can be reused (i.e., rerunning a query does not result in more file handles cached). It also verifies that the cache capacity is enforced.
Change-Id: Ibe5ff60971dd653c3b6a0e13928cfa9fc59d078d --- M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/runtime/buffered-block-mgr.h A be/src/runtime/disk-io-mgr-handle-cache.h A be/src/runtime/disk-io-mgr-handle-cache.inline.h M be/src/runtime/disk-io-mgr-internal.h M be/src/runtime/disk-io-mgr-reader-context.cc M be/src/runtime/disk-io-mgr-scan-range.cc M be/src/runtime/disk-io-mgr.cc M be/src/runtime/disk-io-mgr.h M tests/custom_cluster/test_hdfs_fd_caching.py M tests/metadata/test_refresh_partition.py M tests/query_test/test_hdfs_fd_caching.py 13 files changed, 861 insertions(+), 418 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/6478/18 -- To view, visit http://gerrit.cloudera.org:8080/6478 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ibe5ff60971dd653c3b6a0e13928cfa9fc59d078d Gerrit-PatchSet: 18 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong
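The commit message notes that once scan ranges share one handle, the handle's file offset is no longer private to any scan range, so every read must either be positional (hdfsPread) or seek explicitly before reading. A POSIX analogue of that choice, using Python's stdlib `os.pread`; `read_at` is a hypothetical helper, not part of the change:

```python
import os
import tempfile

def read_at(fd, offset, length, use_pread=True):
    """Read length bytes at an absolute offset from a shared descriptor."""
    if use_pread:
        # os.pread reads at an absolute offset without moving the shared
        # file offset -- the analogue of hdfsPread on a shared handle.
        return os.pread(fd, length, offset)
    # Fallback: seek explicitly before every read, since another scan range
    # may have moved the offset since our last read.
    os.lseek(fd, offset, os.SEEK_SET)
    return os.read(fd, length)

# Demo: two "scan ranges" interleave positional reads on one descriptor.
with tempfile.NamedTemporaryFile() as f:
    f.write(b"0123456789")
    f.flush()
    fd = f.fileno()
    assert read_at(fd, 7, 3) == b"789"                    # range B reads its bytes
    assert read_at(fd, 0, 3, use_pread=False) == b"012"   # range A seeks first
```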
[Impala-ASF-CR] IMPALA-4623: Enable file handle cache
Joe McDonnell has posted comments on this change. Change subject: IMPALA-4623: Enable file handle cache .. Patch Set 17: (2 comments) http://gerrit.cloudera.org:8080/#/c/6478/17/be/src/runtime/disk-io-mgr-scan-range.cc File be/src/runtime/disk-io-mgr-scan-range.cc: Line 288: if (use_file_handle_cache && expected_local_) return Status::OK(); > you need to document that behavior in the scanrange declaration Updated ScanRange::Open documentation and the comment for exclusive_hdfs_fh_. http://gerrit.cloudera.org:8080/#/c/6478/17/be/src/runtime/disk-io-mgr.h File be/src/runtime/disk-io-mgr.h: Line 453: /// file handle from the file handle cache to hold for its exclusive use. > also document the behavior for non-local scan ranges Done -- To view, visit http://gerrit.cloudera.org:8080/6478 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ibe5ff60971dd653c3b6a0e13928cfa9fc59d078d Gerrit-PatchSet: 17 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4623: Enable file handle cache
Marcel Kornacker has posted comments on this change. Change subject: IMPALA-4623: Enable file handle cache .. Patch Set 17: (1 comment) http://gerrit.cloudera.org:8080/#/c/6478/17/be/src/runtime/disk-io-mgr.h File be/src/runtime/disk-io-mgr.h: Line 453: /// file handle from the file handle cache to hold for its exclusive use. also document the behavior for non-local scan ranges -- To view, visit http://gerrit.cloudera.org:8080/6478 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ibe5ff60971dd653c3b6a0e13928cfa9fc59d078d Gerrit-PatchSet: 17 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4623: Enable file handle cache
Marcel Kornacker has posted comments on this change. Change subject: IMPALA-4623: Enable file handle cache .. Patch Set 17: Code-Review+2 (1 comment) http://gerrit.cloudera.org:8080/#/c/6478/17/be/src/runtime/disk-io-mgr-scan-range.cc File be/src/runtime/disk-io-mgr-scan-range.cc: Line 288: if (use_file_handle_cache && expected_local_) return Status::OK(); you need to document that behavior in the scanrange declaration -- To view, visit http://gerrit.cloudera.org:8080/6478 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ibe5ff60971dd653c3b6a0e13928cfa9fc59d078d Gerrit-PatchSet: 17 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4623: Enable file handle cache
Joe McDonnell has posted comments on this change. Change subject: IMPALA-4623: Enable file handle cache .. Patch Set 16: This upload disables the caching for remote files. This also rebases to the latest (which involved minor merges on the tests due to the ADLS change). -- To view, visit http://gerrit.cloudera.org:8080/6478 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ibe5ff60971dd653c3b6a0e13928cfa9fc59d078d Gerrit-PatchSet: 16 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4623: Enable file handle cache
Joe McDonnell has uploaded a new patch set (#17). Change subject: IMPALA-4623: Enable file handle cache .. IMPALA-4623: Enable file handle cache Currently, every scan range maintains a file handle, even when multiple scan ranges are accessing the same file. Opening the file handles causes load on the NameNode, which can lead to scaling issues. There are two parts to this change: 1. Enable file handle caching by default for local files 2. Share the file handle between scan ranges from the same file Local scan ranges no longer maintain their own Hdfs file handles. On each read, the I/O thread will get the Hdfs file handle from the cache (opening it if necessary) and use that for the read. This allows multiple scan ranges on the same file to use the same file handle. Since the file offsets are no longer consistent for an individual scan range, all Hdfs reads need to either use hdfsPread or do a seek before reading. Additionally, since Hdfs read statistics are maintained on the file handle, the read statistics must be retrieved and cleared after each read. To manage contention, the file handle cache is now partitioned by a hash of the key into independent caches with independent locks. The allowed capacity of the file handle cache is split evenly among the partitions. File handles are evicted independently for each partition. The file handle cache maintains ownership of the file handles at all times, but it will not evict a file handle that is in use. If max_cached_file_handles is set to 0, or the scan range is accessing data cached by Hdfs, or the scan range is remote, the scan range will get a file handle from the cache and hold it until the scan range is closed. This mimics the existing behavior, except the file handle stays in the cache and is owned by the cache. Since it is in use, it will not be evicted. If a file handle in the cache becomes invalid, it may result in Read() calls failing.
Consequently, if Read() encounters an error using a file handle from the cache, it will destroy the handle and retry once with a new file handle. Any subsequent error is unrelated to the file handle cache and will be returned. Tests: query_test/test_hdfs_fd_caching.py copies the files from an existing table into a new directory and uses that to create an external table. It queries the external table, then uses the hdfs command line to manipulate the hdfs file (delete, move, etc.). It queries again to make sure we don't crash. Then, it runs "invalidate metadata". It checks the row counts before the modification and after "invalidate metadata", but it does not check the results in between. custom_cluster/test_hdfs_fd_caching.py starts up a cluster with a small file handle cache size. It verifies that a file handle can be reused (i.e., rerunning a query does not result in more file handles cached). It also verifies that the cache capacity is enforced. Change-Id: Ibe5ff60971dd653c3b6a0e13928cfa9fc59d078d --- M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/runtime/buffered-block-mgr.h A be/src/runtime/disk-io-mgr-handle-cache.h A be/src/runtime/disk-io-mgr-handle-cache.inline.h M be/src/runtime/disk-io-mgr-internal.h M be/src/runtime/disk-io-mgr-reader-context.cc M be/src/runtime/disk-io-mgr-scan-range.cc M be/src/runtime/disk-io-mgr.cc M be/src/runtime/disk-io-mgr.h M tests/custom_cluster/test_hdfs_fd_caching.py M tests/metadata/test_refresh_partition.py M tests/query_test/test_hdfs_fd_caching.py 13 files changed, 857 insertions(+), 418 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/6478/17 -- To view, visit http://gerrit.cloudera.org:8080/6478 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ibe5ff60971dd653c3b6a0e13928cfa9fc59d078d Gerrit-PatchSet: 17 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Joe McDonnell
Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong
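The retry policy the commit message describes — on a read error with a cached handle, destroy the handle, retry exactly once with a fresh one, and surface any second error unchanged — can be sketched like this. All names here (`read_with_retry`, `cache.get`, `cache.destroy`, `handle.read`) are hypothetical stand-ins, not Impala's actual interfaces.

```python
def read_with_retry(cache, key, offset, length, open_fn):
    """Read via a cached handle, retrying once with a fresh handle on error.

    Assumed interfaces: cache.get(key, open_fn) returns a handle,
    cache.destroy(key) drops a cached handle, handle.read(offset, length)
    raises IOError on failure.
    """
    handle = cache.get(key, open_fn)
    for attempt in range(2):             # original attempt + at most one retry
        try:
            return handle.read(offset, length)
        except IOError:
            if attempt == 1:
                raise                    # second failure: unrelated to caching
            # The cached handle may have gone stale (file deleted, moved, or
            # replaced): destroy it and open a fresh handle for the retry.
            cache.destroy(key)
            handle = cache.get(key, open_fn)

# Stub cache and handles to exercise the policy.
class StaleHandle:
    def read(self, offset, length):
        raise IOError("stale file handle")

class GoodHandle:
    def read(self, offset, length):
        return b"x" * length

class StubCache:
    def __init__(self, seeded):
        self.handles = dict(seeded)
        self.destroyed = 0
    def get(self, key, open_fn):
        if key not in self.handles:
            self.handles[key] = open_fn(key)
        return self.handles[key]
    def destroy(self, key):
        self.destroyed += 1
        self.handles.pop(key, None)

cache = StubCache({"f": StaleHandle()})   # cached handle is already stale
data = read_with_retry(cache, "f", 0, 4, lambda k: GoodHandle())
assert data == b"xxxx" and cache.destroyed == 1   # one destroy, one retry
```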
[Impala-ASF-CR] IMPALA-4623: Enable file handle cache
Joe McDonnell has posted comments on this change. Change subject: IMPALA-4623: Enable file handle cache .. Patch Set 15: (5 comments) http://gerrit.cloudera.org:8080/#/c/6478/15/be/src/runtime/disk-io-mgr-scan-range.cc File be/src/runtime/disk-io-mgr-scan-range.cc: Line 283: if (exclusive_hdfs_fh_ != nullptr) return Status::OK(); > when will this be true? The IO thread is calling ReadRange on each ScanRange. ReadRange calls Open whether or not that ScanRange was already open. This check short-circuits in the case that the ScanRange was already open. Line 351: VLOG_FILE << "Cache HDFS file handle file=" << file(); > this is not a meaningful log message, let's remove it Removed. Line 420: HdfsFileHandle* hdfs_file_from_cache = nullptr; > the name is somewhat misleading, because exclusive_hdfs_fh_ is still from t Changed to "borrowed_hdfs_fh" Line 484: // - or if the file handle is not from the cache > but exclusive_hdfs_fh_ is also from the cache, why wouldn't we retry in tha Changed to do retry for exclusive file handles. http://gerrit.cloudera.org:8080/#/c/6478/15/be/src/runtime/disk-io-mgr.cc File be/src/runtime/disk-io-mgr.cc: Line 1271: bool cache_hit; > rename to dummy or something like that Done -- To view, visit http://gerrit.cloudera.org:8080/6478 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ibe5ff60971dd653c3b6a0e13928cfa9fc59d078d Gerrit-PatchSet: 15 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4623: Enable file handle cache
Joe McDonnell has uploaded a new patch set (#16). Change subject: IMPALA-4623: Enable file handle cache .. IMPALA-4623: Enable file handle cache Currently, every scan range maintains a file handle, even when multiple scan ranges are accessing the same file. Opening the file handles causes load on the NameNode, which can lead to scaling issues. There are two parts to this change: 1. Enable file handle caching by default 2. Share the file handle between scan ranges from the same file The scan range no longer maintains its own Hdfs file handle. On each read, the I/O thread will get the Hdfs file handle from the cache (opening it if necessary) and use that for the read. This allows multiple scan ranges on the same file to use the same file handle. Since the file offsets are no longer consistent for an individual scan range, all Hdfs reads need to either use hdfsPread or do a seek before reading. Additionally, since Hdfs read statistics are maintained on the file handle, the read statistics must be retrieved and cleared after each read. To manage contention, the file handle cache is now partitioned by a hash of the key into independent caches with independent locks. The allowed capacity of the file handle cache is split evenly among the partitions. File handles are evicted independently for each partition. The file handle cache maintains ownership of the file handles at all times, but it will not evict a file handle that is in use. If max_cached_file_handles is set to 0, or the scan range is accessing data cached by Hdfs, the scan range will get a file handle from the cache and hold it until the scan range is closed. This mimics the existing behavior, except the file handle stays in the cache and is owned by the cache. Since it is in use, it will not be evicted. If a file handle in the cache becomes invalid, it may result in Read() calls failing.
Consequently, if Read() encounters an error using a file handle from the cache, it will destroy the handle and retry once with a new file handle. Any subsequent error is unrelated to the file handle cache and will be returned. Tests: query_test/test_hdfs_fd_caching.py copies the files from an existing table into a new directory and uses that to create an external table. It queries the external table, then uses the hdfs command line to manipulate the hdfs file (delete, move, etc.). It queries again to make sure we don't crash. Then, it runs "invalidate metadata". It checks the row counts before the modification and after "invalidate metadata", but it does not check the results in between. custom_cluster/test_hdfs_fd_caching.py starts up a cluster with a small file handle cache size. It verifies that a file handle can be reused (i.e., rerunning a query does not result in more file handles cached). It also verifies that the cache capacity is enforced. Change-Id: Ibe5ff60971dd653c3b6a0e13928cfa9fc59d078d --- M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/runtime/buffered-block-mgr.h A be/src/runtime/disk-io-mgr-handle-cache.h A be/src/runtime/disk-io-mgr-handle-cache.inline.h M be/src/runtime/disk-io-mgr-internal.h M be/src/runtime/disk-io-mgr-reader-context.cc M be/src/runtime/disk-io-mgr-scan-range.cc M be/src/runtime/disk-io-mgr.cc M be/src/runtime/disk-io-mgr.h M tests/custom_cluster/test_hdfs_fd_caching.py M tests/metadata/test_refresh_partition.py M tests/query_test/test_hdfs_fd_caching.py 13 files changed, 849 insertions(+), 418 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/6478/16 -- To view, visit http://gerrit.cloudera.org:8080/6478 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ibe5ff60971dd653c3b6a0e13928cfa9fc59d078d Gerrit-PatchSet: 16 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Joe McDonnell
Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4623: Enable file handle cache
Marcel Kornacker has posted comments on this change. Change subject: IMPALA-4623: Enable file handle cache .. Patch Set 15: (5 comments) http://gerrit.cloudera.org:8080/#/c/6478/15/be/src/runtime/disk-io-mgr-scan-range.cc File be/src/runtime/disk-io-mgr-scan-range.cc: Line 283: if (exclusive_hdfs_fh_ != nullptr) return Status::OK(); when will this be true? Line 351: VLOG_FILE << "Cache HDFS file handle file=" << file(); this is not a meaningful log message, let's remove it Line 420: HdfsFileHandle* hdfs_file_from_cache = nullptr; the name is somewhat misleading, because exclusive_hdfs_fh_ is still from the cache Line 484: // - or if the file handle is not from the cache but exclusive_hdfs_fh_ is also from the cache, why wouldn't we retry in that case (and reset exclusive_hdfs_fh_)? http://gerrit.cloudera.org:8080/#/c/6478/15/be/src/runtime/disk-io-mgr.cc File be/src/runtime/disk-io-mgr.cc: Line 1271: bool cache_hit; rename to dummy or something like that -- To view, visit http://gerrit.cloudera.org:8080/6478 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ibe5ff60971dd653c3b6a0e13928cfa9fc59d078d Gerrit-PatchSet: 15 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4623: Enable file handle cache
Joe McDonnell has uploaded a new patch set (#12). Change subject: IMPALA-4623: Enable file handle cache .. IMPALA-4623: Enable file handle cache Currently, every scan range maintains a file handle, even when multiple scan ranges are accessing the same file. Opening the file handles causes load on the NameNode, which can lead to scaling issues. There are two parts to this change: 1. Enable file handle caching by default 2. Share the file handle between scan ranges from the same file The scan range no longer maintains its own Hdfs file handle. On each read, the I/O thread will get the Hdfs file handle from the cache (opening it if necessary) and use that for the read. This allows multiple scan ranges on the same file to use the same file handle. Since the file offsets are no longer consistent for an individual scan range, all Hdfs reads need to either use hdfsPread or do a seek before reading. Additionally, since Hdfs read statistics are maintained on the file handle, the read statistics must be retrieved and cleared after each read. To manage contention, the file handle cache is now partitioned by a hash of the key into independent caches with independent locks. The allowed capacity of the file handle cache is split evenly among the partitions. File handles are evicted independently for each partition. The file handle cache maintains ownership of the file handles at all times, but it will not evict a file handle that is in use. If max_cached_file_handles is set to 0, or the scan range is accessing data cached by Hdfs, the scan range will get a file handle from the cache and hold it until the scan range is closed. This mimics the existing behavior, except the file handle stays in the cache and is owned by the cache. Since it is in use, it will not be evicted. If a file handle in the cache becomes invalid, it may result in Read() calls failing.
Consequently, if Read() encounters an error using a file handle from the cache, it will destroy the handle and retry once with a new file handle. Any subsequent error is unrelated to the file handle cache and will be returned. Tests: query_test/test_hdfs_fd_caching.py copies the files from an existing table into a new directory and uses that to create an external table. It queries the external table, then uses the hdfs command line to manipulate the hdfs file (delete, move, etc.). It queries again to make sure we don't crash. Then, it runs "invalidate metadata". It checks the row counts before the modification and after "invalidate metadata", but it does not check the results in between. custom_cluster/test_hdfs_fd_caching.py starts up a cluster with a small file handle cache size. It verifies that a file handle can be reused (i.e., rerunning a query does not result in more file handles cached). It also verifies that the cache capacity is enforced. Change-Id: Ibe5ff60971dd653c3b6a0e13928cfa9fc59d078d --- M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/runtime/buffered-block-mgr.h A be/src/runtime/disk-io-mgr-handle-cache.h A be/src/runtime/disk-io-mgr-handle-cache.inline.h M be/src/runtime/disk-io-mgr-internal.h M be/src/runtime/disk-io-mgr-reader-context.cc M be/src/runtime/disk-io-mgr-scan-range.cc M be/src/runtime/disk-io-mgr.cc M be/src/runtime/disk-io-mgr.h M tests/custom_cluster/test_hdfs_fd_caching.py M tests/metadata/test_refresh_partition.py M tests/query_test/test_hdfs_fd_caching.py 13 files changed, 850 insertions(+), 418 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/6478/12 -- To view, visit http://gerrit.cloudera.org:8080/6478 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ibe5ff60971dd653c3b6a0e13928cfa9fc59d078d Gerrit-PatchSet: 12 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Joe McDonnell
Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4623: Enable file handle cache
Joe McDonnell has posted comments on this change. Change subject: IMPALA-4623: Enable file handle cache .. Patch Set 11: (4 comments) http://gerrit.cloudera.org:8080/#/c/6478/11/be/src/runtime/disk-io-mgr-scan-range.cc File be/src/runtime/disk-io-mgr-scan-range.cc: Line 445: // bad file handle; record the error, but allow for a retry to fix it. > wouldn't the same apply to the pread path? Added retry for hdfsSeek in Open() and hadoopReadZero in ReadFromCache. Since there are multiple locations for retry, I created a separate function that reopens the file handle. Changed the hdfsPread/hdfsRead paths to fill in status inside the loop. Moved this comment up top and made it apply to all the operations. Line 467: if (current_bytes_read != -1 || num_retries >= 1 || > num_retries == 1 and dcheck that it's not >1 somewhere? Done Line 468: (hdfs_file_from_cache == nullptr)) { > remove ()? Done Line 471: // If the file handle is from the cache, it might have become invalid. Destroy > rephrase, we know it's from the cache Done -- To view, visit http://gerrit.cloudera.org:8080/6478 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ibe5ff60971dd653c3b6a0e13928cfa9fc59d078d Gerrit-PatchSet: 11 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4623: Enable file handle cache
Marcel Kornacker has posted comments on this change. Change subject: IMPALA-4623: Enable file handle cache .. Patch Set 11: (4 comments) http://gerrit.cloudera.org:8080/#/c/6478/11/be/src/runtime/disk-io-mgr-scan-range.cc File be/src/runtime/disk-io-mgr-scan-range.cc: Line 445: // bad file handle; record the error, but allow for a retry to fix it. wouldn't the same apply to the pread path? also, wouldn't the same apply to l293? Line 467: if (current_bytes_read != -1 || num_retries >= 1 || num_retries == 1 and dcheck that it's not >1 somewhere? Line 468: (hdfs_file_from_cache == nullptr)) { remove ()? Line 471: // If the file handle is from the cache, it might have become invalid. Destroy rephrase, we know it's from the cache -- To view, visit http://gerrit.cloudera.org:8080/6478 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ibe5ff60971dd653c3b6a0e13928cfa9fc59d078d Gerrit-PatchSet: 11 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4623: Enable file handle cache
Joe McDonnell has uploaded a new patch set (#11). Change subject: IMPALA-4623: Enable file handle cache .. IMPALA-4623: Enable file handle cache Currently, every scan range maintains a file handle, even when multiple scan ranges are accessing the same file. Opening the file handles causes load on the NameNode, which can lead to scaling issues. There are two parts to this transaction: 1. Enable file handle caching by default 2. Share the file handle between scan ranges from the same file The scan range no longer maintains its own Hdfs file handle. On each read, the io thread will get the Hdfs file handle from the cache (opening it if necessary) and use that for the read. This allows multiple scan ranges on the same file to use the same file handle. Since the file offsets are no longer consistent for an individual scan range, all Hdfs reads need to either use hdfsPread or do a seek before reading. Additionally, since Hdfs read statistics are maintained on the file handle, the read statistics must be retrieved and cleared after each read. To manage contention, the file handle cache is now partitioned by a hash of the key into independent caches with independent locks. The allowed capacity of the file handle cache is split evenly among the partitions. File handles are evicted independently for each partition. The file handle cache maintains ownership of the file handles at all times, but it will not evict a file handle that is in use. If max_cached_file_handles is set to 0 or the the scan range is accessing data cached by Hdfs, the scan range will get a file handle from the cache and hold it until the scan range is closed. This mimics the existing behavior, except the file handle stays in the cache and is owned by the cache. Since it is in use, it will not be evicted. If a file handle in the cache becomes invalid, it may result in Read() calls failing. 
Consequently, if Read() encounters an error using a file handle from the cache, it will destroy the handle and retry once with a new file handle. Any subsequent error is unrelated to the file handle cache and will be returned. Tests: query_test/test_hdfs_fd_caching.py copies the files from an existing table into a new directory and uses that to create an external table. It queries the external table, then uses the hdfs commandline to manipulate the hdfs file (delete, move, etc). It queries again to make sure we don't crash. Then, it runs "invalidate metadata". It checks the row counts before the modification and after "invalidate metadata", but it does not check the results in between. custom_cluster/test_hdfs_fd_caching.py starts up a cluster with a small file handle cache size. It verifies that a file handle can be reused (i.e. rerunning a query does not result in more file handles cached). It also verifies that the cache capacity is enforced. Change-Id: Ibe5ff60971dd653c3b6a0e13928cfa9fc59d078d --- M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/runtime/buffered-block-mgr.h A be/src/runtime/disk-io-mgr-handle-cache.h A be/src/runtime/disk-io-mgr-handle-cache.inline.h M be/src/runtime/disk-io-mgr-internal.h M be/src/runtime/disk-io-mgr-reader-context.cc M be/src/runtime/disk-io-mgr-scan-range.cc M be/src/runtime/disk-io-mgr.cc M be/src/runtime/disk-io-mgr.h M tests/custom_cluster/test_hdfs_fd_caching.py M tests/metadata/test_refresh_partition.py M tests/query_test/test_hdfs_fd_caching.py 13 files changed, 815 insertions(+), 410 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/6478/11
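The retry-once behavior described in the commit message above ("destroy the handle and retry once with a new file handle") can be sketched as follows. This is a minimal Python illustration, not Impala's actual C++ implementation; `FileHandleCache`, `read_range`, and the `pread` callback are hypothetical stand-ins for the DiskIoMgr's handle cache and the hdfsPread call.

```python
# Minimal sketch (not Impala's code) of the retry-once rule: if a read
# using a cached file handle fails, destroy that handle and retry exactly
# once with a freshly opened one; a second failure propagates to the caller.

class FileHandleCache:
    """Toy cache keyed by (filename, mtime); open_fn stands in for the
    hdfsOpenFile call that loads the NameNode."""

    def __init__(self, open_fn):
        self._open_fn = open_fn
        self._cached = {}  # (filename, mtime) -> handle

    def get_handle(self, filename, mtime):
        key = (filename, mtime)
        if key in self._cached:
            return self._cached[key], True   # (handle, came_from_cache)
        handle = self._open_fn(filename)
        self._cached[key] = handle
        return handle, False

    def destroy_handle(self, filename, mtime):
        # Drop a handle that produced an error so it cannot be reused.
        self._cached.pop((filename, mtime), None)


def read_range(cache, filename, mtime, offset, length, pread):
    """Read via a cached handle; on error, evict the handle and retry once."""
    handle, from_cache = cache.get_handle(filename, mtime)
    try:
        return pread(handle, offset, length)
    except IOError:
        if not from_cache:
            raise  # the handle was just opened; the error is not staleness
        cache.destroy_handle(filename, mtime)           # evict stale handle
        handle, _ = cache.get_handle(filename, mtime)   # reopen once
        return pread(handle, offset, length)            # 2nd failure propagates
```

A stale cached handle thus triggers exactly one reopen; any error on the fresh handle surfaces to the caller, matching "Any subsequent error is unrelated to the file handle cache and will be returned."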
Joe McDonnell has posted comments on this change. Change subject: IMPALA-4623: Enable file handle cache .. Patch Set 10: (3 comments) About to post another upload. http://gerrit.cloudera.org:8080/#/c/6478/8/be/src/runtime/disk-io-mgr-handle-cache.h File be/src/runtime/disk-io-mgr-handle-cache.h: Line 88: /// is set to true. Otherwise, the partition will try to construct a file handle and > true, but you're describing a scenario with concurrent queries and updates. That makes sense. We don't provide guarantees, and it seems like an uncommon case. I will include this on the JIRA I file for the time based eviction. http://gerrit.cloudera.org:8080/#/c/6478/8/be/src/runtime/disk-io-mgr-scan-range.cc File be/src/runtime/disk-io-mgr-scan-range.cc: Line 417: } else { > so it's just num_bytes_read then. when i see 'last' or 'prev' or something That makes sense. I think this came about trying to avoid a naming collision with the function's argument "bytes_read". Since we already have bytes_read and bytes_read_, I switched this to current_bytes_read rather than num_bytes_read. http://gerrit.cloudera.org:8080/#/c/6478/10/be/src/runtime/disk-io-mgr-scan-range.cc File be/src/runtime/disk-io-mgr-scan-range.cc: Line 352: int success = hdfsGetHedgedReadMetrics(fs_, &hedged_metrics); > you're getting metrics for the whole fs_, but every scan range does that on This came in due to a rebase. IMPALA-4764 added this. I had to move it around because the code changed in Close(). It is setting the metric to this value rather than adding it.
Marcel Kornacker has posted comments on this change. Change subject: IMPALA-4623: Enable file handle cache .. Patch Set 10: (4 comments) http://gerrit.cloudera.org:8080/#/c/6478/8/be/src/runtime/disk-io-mgr-handle-cache.h File be/src/runtime/disk-io-mgr-handle-cache.h: Line 88: /// is set to true. Otherwise, the partition will try to construct a file handle and > Added a TODO for this. I plan on dealing with this when doing eviction by t true, but you're describing a scenario with concurrent queries and updates. we don't make any guarantees in that case. it would be fine for step 6 to notice that the mtime changed when it tries to reopen that file, and return an error at that point. http://gerrit.cloudera.org:8080/#/c/6478/10/be/src/runtime/disk-io-mgr-handle-cache.inline.h File be/src/runtime/disk-io-mgr-handle-cache.inline.h: Line 130: // return code, and we close the file handle and remove it from the cache. thanks for the explanation http://gerrit.cloudera.org:8080/#/c/6478/8/be/src/runtime/disk-io-mgr-scan-range.cc File be/src/runtime/disk-io-mgr-scan-range.cc: Line 417: } else { > This is the number of bytes read by either hdfsPread or hdfsRead. In case o so it's just num_bytes_read then. when i see 'last' or 'prev' or something like that in a loop i usually expect this value gets set in iteration n and used in n+1. http://gerrit.cloudera.org:8080/#/c/6478/10/be/src/runtime/disk-io-mgr-scan-range.cc File be/src/runtime/disk-io-mgr-scan-range.cc: Line 352: int success = hdfsGetHedgedReadMetrics(fs_, &hedged_metrics); you're getting metrics for the whole fs_, but every scan range does that on its own, and then updates a global counter? how is that not double-counting? 
Joe McDonnell has uploaded a new patch set (#10). Change subject: IMPALA-4623: Enable file handle cache .. IMPALA-4623: Enable file handle cache Currently, every scan range maintains a file handle, even when multiple scan ranges are accessing the same file. Opening the file handles causes load on the NameNode, which can lead to scaling issues. There are two parts to this transaction: 1. Enable file handle caching by default 2. Share the file handle between scan ranges from the same file The scan range no longer maintains its own Hdfs file handle. On each read, the io thread will get the Hdfs file handle from the cache (opening it if necessary) and use that for the read. This allows multiple scan ranges on the same file to use the same file handle. Since the file offsets are no longer consistent for an individual scan range, all Hdfs reads need to either use hdfsPread or do a seek before reading. Additionally, since Hdfs read statistics are maintained on the file handle, the read statistics must be retrieved and cleared after each read. To manage contention, the file handle cache is now partitioned by a hash of the key into independent caches with independent locks. The allowed capacity of the file handle cache is split evenly among the partitions. File handles are evicted independently for each partition. The file handle cache maintains ownership of the file handles at all times, but it will not evict a file handle that is in use. If max_cached_file_handles is set to 0 or the scan range is accessing data cached by Hdfs, the scan range will get a file handle from the cache and hold it until the scan range is closed. This mimics the existing behavior, except the file handle stays in the cache and is owned by the cache. Since it is in use, it will not be evicted. Tests: query_test/test_hdfs_fd_caching.py copies the files from an existing table into a new directory and uses that to create an external table. 
It queries the external table, then uses the hdfs commandline to manipulate the hdfs file (delete, move, etc). It queries again to make sure we don't crash. Then, it runs "invalidate metadata". It checks the row counts before the modification and after "invalidate metadata", but it does not check the results in between. custom_cluster/test_hdfs_fd_caching.py starts up a cluster with a small file handle cache size. It verifies that a file handle can be reused (i.e. rerunning a query does not result in more file handles cached). It also verifies that the cache capacity is enforced. Change-Id: Ibe5ff60971dd653c3b6a0e13928cfa9fc59d078d --- M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/runtime/buffered-block-mgr.h A be/src/runtime/disk-io-mgr-handle-cache.h A be/src/runtime/disk-io-mgr-handle-cache.inline.h M be/src/runtime/disk-io-mgr-internal.h M be/src/runtime/disk-io-mgr-reader-context.cc M be/src/runtime/disk-io-mgr-scan-range.cc M be/src/runtime/disk-io-mgr.cc M be/src/runtime/disk-io-mgr.h M tests/custom_cluster/test_hdfs_fd_caching.py M tests/metadata/test_refresh_partition.py M tests/query_test/test_hdfs_fd_caching.py 13 files changed, 769 insertions(+), 410 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/6478/10
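The partitioning scheme described in the commit message (a hash of the key selects one of several independent caches, capacity split evenly, LRU eviction that never evicts an in-use handle, and the mtime matched while scanning entries rather than hashed) can be sketched roughly as below. This is a hypothetical Python analogue, not the patch's C++ code; the class names, entry layout, and partition count are invented for illustration.

```python
# Hypothetical sketch of a partitioned file handle cache: N independent
# caches with independent locks, each holding an even share of the total
# capacity. Only handles that are not in use are eligible for LRU eviction.
import threading
from collections import OrderedDict
from itertools import count

NUM_PARTITIONS = 16  # illustrative; the patch's constant may differ

class _Partition:
    def __init__(self, capacity):
        self.lock = threading.Lock()
        self.capacity = capacity
        # entry id -> {"key": (filename, mtime), "handle": ..., "in_use": bool}
        # OrderedDict insertion order doubles as LRU order (oldest first).
        self.entries = OrderedDict()

class PartitionedHandleCache:
    def __init__(self, max_handles, open_fn):
        per_partition = max(1, max_handles // NUM_PARTITIONS)
        self._partitions = [_Partition(per_partition)
                            for _ in range(NUM_PARTITIONS)]
        self._open_fn = open_fn
        self._ids = count()

    def _partition(self, filename):
        # Hash only the filename; mtime is matched while scanning entries.
        return self._partitions[hash(filename) % NUM_PARTITIONS]

    def get_handle(self, filename, mtime):
        part = self._partition(filename)
        with part.lock:
            for eid, e in part.entries.items():
                if e["key"] == (filename, mtime) and not e["in_use"]:
                    e["in_use"] = True        # in-use handles are never evicted
                    return eid, e["handle"]
            handle = self._open_fn(filename)  # no unused match: open a new one
            eid = next(self._ids)
            part.entries[eid] = {"key": (filename, mtime),
                                 "handle": handle, "in_use": True}
            self._evict_locked(part)
            return eid, handle

    def release_handle(self, filename, eid):
        part = self._partition(filename)
        with part.lock:
            e = part.entries.pop(eid)
            e["in_use"] = False
            part.entries[eid] = e             # re-insert as most recently used
            self._evict_locked(part)

    def _evict_locked(self, part):
        # Evict least-recently-used entries, skipping handles still in use;
        # a partition may exceed capacity while everything in it is in use.
        while len(part.entries) > part.capacity:
            victim = next((eid for eid, e in part.entries.items()
                           if not e["in_use"]), None)
            if victim is None:
                break
            del part.entries[victim]

    def size(self):
        return sum(len(p.entries) for p in self._partitions)
```

Because the cache owns every handle and eviction skips in-use entries, a scan range that holds a handle for its whole lifetime (the Hdfs-cached case) is simply a handle that stays marked in use.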
Joe McDonnell has uploaded a new patch set (#9). Change subject: IMPALA-4623: Enable file handle cache .. IMPALA-4623: Enable file handle cache Currently, every scan range maintains a file handle, even when multiple scan ranges are accessing the same file. Opening the file handles causes load on the NameNode, which can lead to scaling issues. There are two parts to this transaction: 1. Enable file handle caching by default 2. Share the file handle between scan ranges from the same file The scan range no longer maintains its own Hdfs file handle. On each read, the io thread will get the Hdfs file handle from the cache (opening it if necessary) and use that for the read. This allows multiple scan ranges on the same file to use the same file handle. Since the file offsets are no longer consistent for an individual scan range, all Hdfs reads need to either use hdfsPread or do a seek before reading. Additionally, since Hdfs read statistics are maintained on the file handle, the read statistics must be retrieved and cleared after each read. To manage contention, the file handle cache is now partitioned by a hash of the key into independent caches with independent locks. The allowed capacity of the file handle cache is split evenly among the partitions. File handles are evicted independently for each partition. The file handle cache maintains ownership of the file handles at all times, but it will not evict a file handle that is in use. If max_cached_file_handles is set to 0 or the scan range is accessing data cached by Hdfs, the scan range will get a file handle from the cache and hold it until the scan range is closed. This mimics the existing behavior, except the file handle stays in the cache and is owned by the cache. Since it is in use, it will not be evicted. Tests: query_test/test_hdfs_fd_caching.py copies the files from an existing table into a new directory and uses that to create an external table. 
It queries the external table, then uses the hdfs commandline to manipulate the hdfs file (delete, move, etc). It queries again to make sure we don't crash. Then, it runs "invalidate metadata". It checks the row counts before the modification and after "invalidate metadata", but it does not check the results in between. custom_cluster/test_hdfs_fd_caching.py starts up a cluster with a small file handle cache size. It verifies that a file handle can be reused (i.e. rerunning a query does not result in more file handles cached). It also verifies that the cache capacity is enforced. Change-Id: Ibe5ff60971dd653c3b6a0e13928cfa9fc59d078d --- M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/runtime/buffered-block-mgr.h A be/src/runtime/disk-io-mgr-handle-cache.h A be/src/runtime/disk-io-mgr-handle-cache.inline.h M be/src/runtime/disk-io-mgr-internal.h M be/src/runtime/disk-io-mgr-reader-context.cc M be/src/runtime/disk-io-mgr-scan-range.cc M be/src/runtime/disk-io-mgr.cc M be/src/runtime/disk-io-mgr.h M tests/custom_cluster/test_hdfs_fd_caching.py M tests/metadata/test_refresh_partition.py M tests/query_test/test_hdfs_fd_caching.py 13 files changed, 757 insertions(+), 399 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/6478/9
Joe McDonnell has posted comments on this change. Change subject: IMPALA-4623: Enable file handle cache .. Patch Set 8: (15 comments) http://gerrit.cloudera.org:8080/#/c/6478/8/be/src/runtime/disk-io-mgr-handle-cache.h File be/src/runtime/disk-io-mgr-handle-cache.h: Line 88: /// with a call to ReleaseFileHandle to release exclusive control. > this should also evict handles if a more recent mtime for a file with exist Added a TODO for this. I plan on dealing with this when doing eviction by timeout. Here is a scenario that I might be concerned about for immediate eviction when seeing a higher mtime: 1. ScanRange #1 starts up 2. ScanRange #1 does a read, putting a file handle with mtime=5 in the cache. 3. The file is modified and now has an mtime=6. 4. ScanRange #2 starts up 5. ScanRange #2 does a read, putting a file handle with mtime=6 in the cache and removing the mtime=5 file handle. 6. ScanRange #1 wants to do another read and goes to get a file handle from the cache, but mtime=5 file handles aren't available. What happens now? It could return an error, which makes some sense. It could open a new file handle, but it would be looking at the mtime=6 file. The mtime=6 file might have different offsets, so it might be reading garbage. A lot of this is hypothetical, but it would be nice for a ScanRange to deal with a single version of a file. That is a property that the old code had. Line 98: static const uint32_t HASH_SEED = 0x87654321; > where from? No need for a non-zero hash seed for our purpose, so I have removed this and substituted zero in the code. http://gerrit.cloudera.org:8080/#/c/6478/8/be/src/runtime/disk-io-mgr-handle-cache.inline.h File be/src/runtime/disk-io-mgr-handle-cache.inline.h: Line 65: // Keep going until find unused entry with the same mtime > fix grammar Done Line 113: // to its place in the map > add it? I started working on this. The templating starts to infect everything. 
I will keep looking into it for a followup, but I think I will leave this as a TODO for the first version. Line 117: if (elem->fh.get() == fh) { > single line Done Line 129: // File can be unbuffered > where does the corresponding buffering take place? Added information in the comment. The buffering here is readahead buffering that Hdfs does for a read. That is what we are unbuffering. http://gerrit.cloudera.org:8080/#/c/6478/8/be/src/runtime/disk-io-mgr-scan-range.cc File be/src/runtime/disk-io-mgr-scan-range.cc: Line 417: int last_read = -1; > what does this mean? does it count anything, and if so, what? This is the number of bytes read by either hdfsPread or hdfsRead. In case of an error, it returns -1. I changed the name to "last_bytes_read" Line 429: return Status(GetHdfsErrorMsg("Error seeking HDFS file: ", file_)); > also print position Done Line 435: return Status(GetHdfsErrorMsg("Error reading from HDFS file: ", file_)); > leave todo that you need to retry once to deal with stale cache entries Done Line 549: remote_bytes_ += stats->totalBytesRead - stats->totalLocalBytesRead; > why can't remote_bytes_ go into reader_? The only real obstacle is this piece of logging: https://github.com/apache/incubator-impala/blob/master/be/src/runtime/disk-io-mgr-scan-range.cc#L341 (It is in a different location after this code change.) The logging is printing the number of unexpected remote bytes for this scan range. If the remote bytes go the reader_, I don't have a way to separate it out. If we don't need the logging, then the state on the ScanRange can be a boolean, and we can send remote bytes directly to reader_. http://gerrit.cloudera.org:8080/#/c/6478/8/be/src/runtime/disk-io-mgr.cc File be/src/runtime/disk-io-mgr.cc: Line 1300 > this comment disappeared. Added a comment in the new location. Hdfs can do some readahead buffering, so we want to reduce this buffering when putting a file handle in the cache. 
Line 278: cached_read_options_(nullptr), > move into .h file and remove here Done http://gerrit.cloudera.org:8080/#/c/6478/8/be/src/runtime/disk-io-mgr.h File be/src/runtime/disk-io-mgr.h: Line 498: int64_t remote_bytes_; > num_... Done http://gerrit.cloudera.org:8080/#/c/6478/8/tests/query_test/test_hdfs_fd_caching.py File tests/query_test/test_hdfs_fd_caching.py: Line 74: self.execute_query_expect_success(self.client, count_query) > this should give some expected result Done Line 99: self.execute_query_expect_success(self.client, count_query) > you should check for the expected result, depending on the modification typ Done
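The read-statistics handling behind the remote_bytes_ discussion above can be illustrated as follows. This is a hypothetical Python analogue of the libhdfs flow (fetch the per-handle counters after each read, account remote bytes as totalBytesRead minus totalLocalBytesRead, then clear the counters so the next scan range sharing the handle starts from zero, as hdfsFileGetReadStatistics and hdfsFileClearReadStatistics allow); the class and method names are invented for the sketch.

```python
# Hypothetical sketch: read statistics live on the shared file handle, so
# every read must fetch them, account the remote portion (total - local),
# and clear them before another scan range reuses the same handle.

class ReadStatistics:
    def __init__(self):
        self.total_bytes_read = 0
        self.total_local_bytes_read = 0

class SharedFileHandle:
    def __init__(self):
        self.stats = ReadStatistics()

    def pread(self, offset, length, local):
        # The simulated filesystem accumulates counters on the handle,
        # the way libhdfs does for an hdfsFile.
        self.stats.total_bytes_read += length
        if local:
            self.stats.total_local_bytes_read += length
        return b"\x00" * length

def get_and_clear_read_statistics(handle):
    # Analogue of hdfsFileGetReadStatistics + hdfsFileClearReadStatistics.
    stats = handle.stats
    handle.stats = ReadStatistics()
    return stats

class ScanRange:
    def __init__(self):
        self.remote_bytes = 0

    def read(self, handle, offset, length, local=True):
        data = handle.pread(offset, length, local)
        stats = get_and_clear_read_statistics(handle)
        # Remote bytes for this read = total minus the local portion.
        self.remote_bytes += (stats.total_bytes_read
                              - stats.total_local_bytes_read)
        return data
```

Clearing after every read is what keeps per-scan-range accounting correct even though several scan ranges now share one handle.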
Marcel Kornacker has posted comments on this change. Change subject: IMPALA-4623: Enable file handle cache .. Patch Set 8: (16 comments) http://gerrit.cloudera.org:8080/#/c/6478/8/be/src/runtime/disk-io-mgr-handle-cache.h File be/src/runtime/disk-io-mgr-handle-cache.h: Line 88: /// with a call to ReleaseFileHandle to release exclusive control. this should also evict handles if a more recent mtime for a file with existing cache entries is being requested (rather than waiting for them to age out). Line 98: static const uint32_t HASH_SEED = 0x87654321; where from? http://gerrit.cloudera.org:8080/#/c/6478/8/be/src/runtime/disk-io-mgr-handle-cache.inline.h File be/src/runtime/disk-io-mgr-handle-cache.inline.h: Line 65: // Keep going until find unused entry with the same mtime fix grammar Line 113: // to its place in the map add it? Line 117: if (elem->fh.get() == fh) { single line Line 129: // File can be unbuffered where does the corresponding buffering take place? this looks asymmetrical. http://gerrit.cloudera.org:8080/#/c/6478/8/be/src/runtime/disk-io-mgr-scan-range.cc File be/src/runtime/disk-io-mgr-scan-range.cc: Line 417: int last_read = -1; what does this mean? does it count anything, and if so, what? Line 429: return Status(GetHdfsErrorMsg("Error seeking HDFS file: ", file_)); also print position Line 435: return Status(GetHdfsErrorMsg("Error reading from HDFS file: ", file_)); leave todo that you need to retry once to deal with stale cache entries Line 549: remote_bytes_ += stats->totalBytesRead - stats->totalLocalBytesRead; why can't remote_bytes_ go into reader_? http://gerrit.cloudera.org:8080/#/c/6478/8/be/src/runtime/disk-io-mgr.cc File be/src/runtime/disk-io-mgr.cc: Line 1300 this comment disappeared. why is this being called? 
Line 278: cached_read_options_(nullptr), move into .h file and remove here http://gerrit.cloudera.org:8080/#/c/6478/6/be/src/runtime/disk-io-mgr.h File be/src/runtime/disk-io-mgr.h: Line 494: /// times for a scan range, it is necessary to keep some state about whether this the description makes it sound as if you need a flag, not a count. http://gerrit.cloudera.org:8080/#/c/6478/8/be/src/runtime/disk-io-mgr.h File be/src/runtime/disk-io-mgr.h: Line 498: int64_t remote_bytes_; num_... http://gerrit.cloudera.org:8080/#/c/6478/8/tests/query_test/test_hdfs_fd_caching.py File tests/query_test/test_hdfs_fd_caching.py: Line 74: self.execute_query_expect_success(self.client, count_query) this should give some expected result Line 99: self.execute_query_expect_success(self.client, count_query) you should check for the expected result, depending on the modification type
Joe McDonnell has uploaded a new patch set (#8). Change subject: IMPALA-4623: Enable file handle cache .. IMPALA-4623: Enable file handle cache Currently, every scan range maintains a file handle, even when multiple scan ranges are accessing the same file. Opening the file handles causes load on the NameNode, which can lead to scaling issues. There are two parts to this transaction: 1. Enable file handle caching by default 2. Share the file handle between scan ranges from the same file The scan range no longer maintains its own Hdfs file handle. On each read, the io thread will get the Hdfs file handle from the cache (opening it if necessary) and use that for the read. This allows multiple scan ranges on the same file to use the same file handle. Since the file offsets are no longer consistent for an individual scan range, all Hdfs reads need to either use hdfsPread or do a seek before reading. Additionally, since Hdfs read statistics are maintained on the file handle, the read statistics must be retrieved and cleared after each read. To manage contention, the file handle cache is now partitioned by a hash of the key into independent caches with independent locks. The allowed capacity of the file handle cache is split evenly among the partitions. File handles are evicted independently for each partition. The file handle cache maintains ownership of the file handles at all times, but it will not evict a file handle that is in use. If max_cached_file_handles is set to 0 or the scan range is accessing data cached by Hdfs, the scan range will get a file handle from the cache and hold it until the scan range is closed. This mimics the existing behavior, except the file handle stays in the cache and is owned by the cache. Since it is in use, it will not be evicted. Tests: test_hdfs_fd_caching.py copies the files from an existing table into a new directory and uses that to create an external table. 
It queries the external table, then uses the hdfs commandline to manipulate the hdfs file (delete, move, etc). It queries again to make sure we don't crash. Then, it runs "invalidate metadata". It does not check the content of the rows returned. TODO: 1. Determine appropriate defaults. 2. Other tests a. File overwrite b. Any others? 3. For scan ranges that use Hdfs caching, should there be some sharing at the scanner level? Change-Id: Ibe5ff60971dd653c3b6a0e13928cfa9fc59d078d --- M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/runtime/buffered-block-mgr.h A be/src/runtime/disk-io-mgr-handle-cache.h A be/src/runtime/disk-io-mgr-handle-cache.inline.h M be/src/runtime/disk-io-mgr-internal.h M be/src/runtime/disk-io-mgr-reader-context.cc M be/src/runtime/disk-io-mgr-scan-range.cc M be/src/runtime/disk-io-mgr.cc M be/src/runtime/disk-io-mgr.h M tests/custom_cluster/test_hdfs_fd_caching.py M tests/metadata/test_refresh_partition.py M tests/query_test/test_hdfs_fd_caching.py 13 files changed, 718 insertions(+), 390 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/6478/8
Joe McDonnell has uploaded a new patch set (#7). Change subject: IMPALA-4623: Enable file handle cache .. IMPALA-4623: Enable file handle cache Currently, every scan range maintains a file handle, even when multiple scan ranges are accessing the same file. Opening the file handles causes load on the NameNode, which can lead to scaling issues. There are two parts to this transaction: 1. Enable file handle caching by default 2. Share the file handle between scan ranges from the same file The scan range no longer maintains its own Hdfs file handle. On each read, the io thread will get the Hdfs file handle from the cache (opening it if necessary) and use that for the read. This allows multiple scan ranges on the same file to use the same file handle. Since the file offsets are no longer consistent for an individual scan range, all Hdfs reads need to either use hdfsPread or do a seek before reading. Additionally, since Hdfs read statistics are maintained on the file handle, the read statistics must be retrieved and cleared after each read. To manage contention, the file handle cache is now partitioned by a hash of the key into independent caches with independent locks. The allowed capacity of the file handle cache is split evenly among the partitions. File handles are evicted independently for each partition. The file handle cache maintains ownership of the file handles at all times, but it will not evict a file handle that is in use. If max_cached_file_handles is set to 0 or the scan range is accessing data cached by Hdfs, the scan range will get a file handle from the cache and hold it until the scan range is closed. This mimics the existing behavior, except the file handle stays in the cache and is owned by the cache. Since it is in use, it will not be evicted. Tests: test_hdfs_fd_caching.py copies the files from an existing table into a new directory and uses that to create an external table. 
It queries the external table, then uses the hdfs commandline to manipulate the hdfs file (delete, move, etc). It queries again to make sure we don't crash. Then, it runs "invalidate metadata". It does not check the content of the rows returned. TODO: 1. Determine appropriate defaults. 2. Other tests a. File overwrite b. Any others? 3. For scan ranges that use Hdfs caching, should there be some sharing at the scanner level? Change-Id: Ibe5ff60971dd653c3b6a0e13928cfa9fc59d078d --- M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/runtime/buffered-block-mgr.h A be/src/runtime/disk-io-mgr-handle-cache.h A be/src/runtime/disk-io-mgr-handle-cache.inline.h M be/src/runtime/disk-io-mgr-internal.h M be/src/runtime/disk-io-mgr-reader-context.cc M be/src/runtime/disk-io-mgr-scan-range.cc M be/src/runtime/disk-io-mgr.cc M be/src/runtime/disk-io-mgr.h M tests/metadata/test_refresh_partition.py M tests/query_test/test_hdfs_fd_caching.py 12 files changed, 702 insertions(+), 374 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/6478/7
Joe McDonnell has posted comments on this change. Change subject: IMPALA-4623: Enable file handle cache .. Patch Set 4: (6 comments) http://gerrit.cloudera.org:8080/#/c/6478/5/be/src/exec/hdfs-scan-node-base.cc File be/src/exec/hdfs-scan-node-base.cc: Line 92: tnode.hdfs_scan_node.parquet_mr_write_zone : ""), > you can also initialize that inline in the .h file Done, also did this for counters_running_ http://gerrit.cloudera.org:8080/#/c/6478/5/be/src/runtime/disk-io-mgr-handle-cache.h File be/src/runtime/disk-io-mgr-handle-cache.h: Line 105 > no trailing _ for structs Done http://gerrit.cloudera.org:8080/#/c/6478/6/be/src/runtime/disk-io-mgr-handle-cache.h File be/src/runtime/disk-io-mgr-handle-cache.h: Line 116 > provide comments for variables, in particular invariants (e.g., it looks li Done http://gerrit.cloudera.org:8080/#/c/6478/5/be/src/runtime/disk-io-mgr-handle-cache.inline.h File be/src/runtime/disk-io-mgr-handle-cache.inline.h: Line 33 > could these be null? hdfs_file_ can be null when we fail to open the file (hdfsOpenFile returns null). http://gerrit.cloudera.org:8080/#/c/6478/6/be/src/runtime/disk-io-mgr-handle-cache.inline.h File be/src/runtime/disk-io-mgr-handle-cache.inline.h: Line 61 > also: use something out of util/hash-util.h Done http://gerrit.cloudera.org:8080/#/c/6478/4/be/src/runtime/disk-io-mgr.cc File be/src/runtime/disk-io-mgr.cc: Line 1309: string key = string(fname) + ":" + std::to_string(mtime); > probably a good idea, it's unlikely that we'll see a lot of overwritten fil Changed this to hash the filename and then match the mtime on iterating. 
Marcel Kornacker has posted comments on this change. Change subject: IMPALA-4623: Enable file handle cache .. Patch Set 6: (6 comments) quick comments http://gerrit.cloudera.org:8080/#/c/6478/5/be/src/exec/hdfs-scan-node-base.cc File be/src/exec/hdfs-scan-node-base.cc: Line 92: initial_ranges_issued_(false), you can also initialize that inline in the .h file http://gerrit.cloudera.org:8080/#/c/6478/5/be/src/runtime/disk-io-mgr-handle-cache.h File be/src/runtime/disk-io-mgr-handle-cache.h: Line 105: std::unique_ptr fh_; no trailing _ for structs http://gerrit.cloudera.org:8080/#/c/6478/6/be/src/runtime/disk-io-mgr-handle-cache.h File be/src/runtime/disk-io-mgr-handle-cache.h: Line 116: MapType cache_; provide comments for variables, in particular invariants (e.g., it looks like the lru list only contains unused handles) http://gerrit.cloudera.org:8080/#/c/6478/5/be/src/runtime/disk-io-mgr-handle-cache.inline.h File be/src/runtime/disk-io-mgr-handle-cache.inline.h: Line 33: if (hdfs_file_ != nullptr && fs_ != nullptr) { could these be null? http://gerrit.cloudera.org:8080/#/c/6478/6/be/src/runtime/disk-io-mgr-handle-cache.inline.h File be/src/runtime/disk-io-mgr-handle-cache.inline.h: Line 61: int index = hash()(key) % NUM_PARTITIONS; also: use something out of util/hash-util.h http://gerrit.cloudera.org:8080/#/c/6478/4/be/src/runtime/disk-io-mgr.cc File be/src/runtime/disk-io-mgr.cc: Line 1309 > I use this as the key for the map, so switching the hash alone wouldn't eli probably a good idea, it's unlikely that we'll see a lot of overwritten files. 
-- Gerrit-MessageType: comment Gerrit-Change-Id: Ibe5ff60971dd653c3b6a0e13928cfa9fc59d078d Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
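Marcel's suggestion above — hash fname and mtime separately with something out of util/hash-util.h instead of materializing a "fname:mtime" string — can be sketched roughly as follows. This is a hedged stand-in, not Impala's code: `HashCombine`, `NUM_PARTITIONS`, and `PartitionOf` are illustrative names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string_view>

// Illustrative stand-in for util/hash-util.h: mix two hashes without
// building (and heap-allocating) a "fname:mtime" key string.
inline size_t HashCombine(size_t seed, size_t h) {
  // Boost-style mixer; the constants are illustrative, not Impala's.
  return seed ^ (h + 0x9e3779b97f4a7c15ULL + (seed << 6) + (seed >> 2));
}

constexpr int NUM_PARTITIONS = 16;

// Pick a cache partition directly from the raw filename bytes and the mtime.
inline int PartitionOf(const char* fname, int64_t mtime) {
  size_t h = std::hash<std::string_view>()(fname);  // hashes the bytes, no copy
  h = HashCombine(h, std::hash<int64_t>()(mtime));
  return static_cast<int>(h % NUM_PARTITIONS);
}
```

The point of the review comment is the allocation: building the key string does a heap allocation per lookup, while hashing the two fields directly touches only the existing buffers.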
[Impala-ASF-CR] IMPALA-4623: Enable file handle cache
Joe McDonnell has uploaded a new patch set (#6). Change subject: IMPALA-4623: Enable file handle cache .. IMPALA-4623: Enable file handle cache Currently, every scan range maintains a file handle, even when multiple scan ranges are accessing the same file. Opening the file handles causes load on the NameNode, which can lead to scaling issues. There are two parts to this transaction: 1. Enable file handle caching by default 2. Share the file handle between scan ranges from the same file The scan range no longer maintains its own Hdfs file handle. On each read, the io thread will get the Hdfs file handle from the cache (opening it if necessary) and use that for the read. This allows multiple scan ranges on the same file to use the same file handle. Since the file offsets are no longer consistent for an individual scan range, all Hdfs reads need to either use hdfsPread or do a seek before reading. Additionally, since Hdfs read statistics are maintained on the file handle, the read statistics must be retrieved and cleared after each read. Scan ranges that are accessing data cached by Hdfs will get a file handle from the cache, but the file handle will be kept on the scan range for the time that the scan range is in use. This prevents the cache from closing the file handle while the data buffer is in use. To manage contention, the file handle cache is now partitioned by a hash of the key into independent caches with independent locks. The allowed capacity of the file handle cache is split evenly among the partitions. File handles are evicted independently for each partition. The file handle cache maintains ownership of the file handles at all times, but it will not evict a file handle that is in use. If max_cached_file_handles is set to 0, file handle caching is off and the previous behavior applies. Tests: test_hdfs_fd_caching.py copies the files from an existing table into a new directory and uses that to create an external table. 
It queries the external table, then uses the hdfs commandline to manipulate the hdfs file (delete, move, etc). It queries again to make sure we don't crash. Then, it runs "invalidate metadata". In the delete case, we expect zero rows. In the move case, we expect the same number of rows. TODO: 1. Determine appropriate defaults. 2. Other tests a. File overwrite b. Any others? 3. For scan ranges that use Hdfs caching, should there be some sharing at the scanner level? Change-Id: Ibe5ff60971dd653c3b6a0e13928cfa9fc59d078d --- M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/runtime/buffered-block-mgr.h A be/src/runtime/disk-io-mgr-handle-cache.h A be/src/runtime/disk-io-mgr-handle-cache.inline.h M be/src/runtime/disk-io-mgr-internal.h M be/src/runtime/disk-io-mgr-reader-context.cc M be/src/runtime/disk-io-mgr-scan-range.cc M be/src/runtime/disk-io-mgr.cc M be/src/runtime/disk-io-mgr.h M tests/query_test/test_hdfs_fd_caching.py 11 files changed, 679 insertions(+), 368 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/6478/6 -- Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ibe5ff60971dd653c3b6a0e13928cfa9fc59d078d Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong
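The commit message's point about offsets — once a handle is shared, its file position can be moved by any scan range, so every read must carry its own offset (hdfsPread) or seek first — can be illustrated with POSIX pread as a local stand-in for hdfsPread. This is a sketch, not Impala's DiskIoMgr code:

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <string>
#include <fcntl.h>
#include <unistd.h>

// Like hdfsPread: the offset travels with the call, so two scan ranges can
// interleave reads on one shared descriptor without clobbering each other's
// file position.
ssize_t ReadRange(int fd, char* buf, size_t len, off_t offset) {
  return ::pread(fd, buf, len, offset);
}
```

With plain read(), the second range's read would start wherever the first range left the shared position; a positional read makes each call self-describing, which is exactly why shared handles force either hdfsPread or a seek-before-read.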
[Impala-ASF-CR] IMPALA-4623: Enable file handle cache
Joe McDonnell has posted comments on this change. Change subject: IMPALA-4623: Enable file handle cache .. Patch Set 4: (4 comments) Fixed the test comments and fixed some errors in the code found by other existing tests (custom_cluster/test_hdfs_fd_caching.py tests the metrics). http://gerrit.cloudera.org:8080/#/c/6478/4/be/src/runtime/disk-io-mgr.cc File be/src/runtime/disk-io-mgr.cc: Line 1331: ImpaladMetrics::IO_MGR_NUM_CACHED_FILE_HANDLES->Increment(-1L); > I'm splitting hairs about the exact meaning of cached. I found an existing test that was checking the metrics, and I've changed this code to match what that test was expecting. IO_MGR_NUM_FILE_HANDLES_OUTSTANDING is the number of file handles in use. IO_MGR_NUM_CACHED_FILE_HANDLES now is the count of all file handles. It would also make sense for this to exclude the outstanding file handles, but right now, this counts all file handles. http://gerrit.cloudera.org:8080/#/c/6478/4/tests/query_test/test_hdfs_fd_caching.py File tests/query_test/test_hdfs_fd_caching.py: Line 76: # File deleted: either # rows is the same or # rows = 0 > we don't make any guarantees for this, so might as well not check the resul Changed to not check the results. Line 98: new_table_location = "{0}/test-warehouse/{1}".format(FILESYSTEM_PREFIX, unique_database) > long lines Done Line 105: # Delete the whole directory (including data file) > this function is pretty much identical to the one above (and below), except Done -- Gerrit-MessageType: comment Gerrit-Change-Id: Ibe5ff60971dd653c3b6a0e13928cfa9fc59d078d Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4623: Enable file handle cache
Joe McDonnell has uploaded a new patch set (#5). Change subject: IMPALA-4623: Enable file handle cache .. IMPALA-4623: Enable file handle cache Currently, every scan range maintains a file handle, even when multiple scan ranges are accessing the same file. Opening the file handles causes load on the NameNode, which can lead to scaling issues. There are two parts to this transaction: 1. Enable file handle caching by default 2. Share the file handle between scan ranges from the same file The scan range no longer maintains its own Hdfs file handle. On each read, the io thread will get the Hdfs file handle from the cache (opening it if necessary) and use that for the read. This allows multiple scan ranges on the same file to use the same file handle. Since the file offsets are no longer consistent for an individual scan range, all Hdfs reads need to either use hdfsPread or do a seek before reading. Additionally, since Hdfs read statistics are maintained on the file handle, the read statistics must be retrieved and cleared after each read. Scan ranges that are accessing data cached by Hdfs will get a file handle from the cache, but the file handle will be kept on the scan range for the time that the scan range is in use. This prevents the cache from closing the file handle while the data buffer is in use. To manage contention, the file handle cache is now partitioned by a hash of the key into independent caches with independent locks. The allowed capacity of the file handle cache is split evenly among the partitions. File handles are evicted independently for each partition. The file handle cache maintains ownership of the file handles at all times, but it will not evict a file handle that is in use. If max_cached_file_handles is set to 0, file handle caching is off and the previous behavior applies. Tests: test_hdfs_fd_caching.py copies the files from an existing table into a new directory and uses that to create an external table. 
It queries the external table, then uses the hdfs commandline to manipulate the hdfs file (delete, move, etc). It queries again to make sure we don't crash. Then, it runs "invalidate metadata". In the delete case, we expect zero rows. In the move case, we expect the same number of rows. TODO: 1. Determine appropriate defaults. 2. Other tests a. File overwrite b. Any others? 3. For scan ranges that use Hdfs caching, should there be some sharing at the scanner level? Change-Id: Ibe5ff60971dd653c3b6a0e13928cfa9fc59d078d --- M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/runtime/buffered-block-mgr.h A be/src/runtime/disk-io-mgr-handle-cache.h A be/src/runtime/disk-io-mgr-handle-cache.inline.h M be/src/runtime/disk-io-mgr-internal.h M be/src/runtime/disk-io-mgr-reader-context.cc M be/src/runtime/disk-io-mgr-scan-range.cc M be/src/runtime/disk-io-mgr.cc M be/src/runtime/disk-io-mgr.h M tests/query_test/test_hdfs_fd_caching.py 11 files changed, 732 insertions(+), 368 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/6478/5 -- Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ibe5ff60971dd653c3b6a0e13928cfa9fc59d078d Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4623: Enable file handle cache
Joe McDonnell has posted comments on this change. Change subject: IMPALA-4623: Enable file handle cache .. Patch Set 4: (23 comments) Still fixing the test, but here is an update on the rest of it. http://gerrit.cloudera.org:8080/#/c/6478/4/be/src/exec/hdfs-scan-node-base.cc File be/src/exec/hdfs-scan-node-base.cc: Line 894: runtime_state_->io_mgr()->cached_file_handles_hit_count(reader_context_)); > pretty awkward. why not call the reader directly (and maybe introduce a new DiskIoRequestContext is an opaque type. disk-io-mgr.h declares it, but the body is defined in disk-io-mgr-internal.h, and we don't include that anywhere outside of disk-io-mgr*. It is annoying. http://gerrit.cloudera.org:8080/#/c/6478/4/be/src/exec/hdfs-scan-node-base.h File be/src/exec/hdfs-scan-node-base.h: Line 441: RuntimeProfile::Counter* cached_file_handles_hit_count_; > initialize to nullptr here, that way you don't need to do it in the c'tor ( Done http://gerrit.cloudera.org:8080/#/c/6478/4/be/src/runtime/buffered-block-mgr.h File be/src/runtime/buffered-block-mgr.h: Line 25: #include > why? boost/unordered_map.hpp is needed, but it was being implicitly included by including lru-cache.h in disk-io-mgr.h. I removed lru-cache.h from disk-io-mgr.h, which broke this. http://gerrit.cloudera.org:8080/#/c/6478/4/be/src/runtime/disk-io-mgr-scan-range.cc File be/src/runtime/disk-io-mgr-scan-range.cc: Line 342: io_mgr_->CacheOrCloseFileHandle(file(), exclusive_hdfs_fh_); > yes, the name really needs to distinguish between cache operations and dire Done Line 548: remote_bytes_ += stats->totalBytesRead - stats->totalLocalBytesRead; > why not stick that into reader_ as well? 
Added a comment in disk-io-mgr.h http://gerrit.cloudera.org:8080/#/c/6478/4/be/src/runtime/disk-io-mgr.cc File be/src/runtime/disk-io-mgr.cc: Line 1309: string key = string(fname) + ":" + std::to_string(mtime); > you can get rid of dealing with strings here (and doing a bunch of allocati I use this as the key for the map, so switching the hash alone wouldn't eliminate this. However, I could hash on the filename and use the filename as the key, then as it is iterating through file handles for that filename, find one that is free with a matching mtime. Line 1315: // Get lock > that's obvious, no need to paraphrase a single statement (unless there's so Done Line 1318: // Look up in the multimap (find range of values for key) > same here Done Line 1326: if (!elem->reference_count_) { > use explicit numeric comparisons, this isn't a bool (although maybe it shou Changed to bool. Line 1331: ImpaladMetrics::IO_MGR_NUM_CACHED_FILE_HANDLES->Increment(-1L); > hm, i'm not sure we should do that. it's still a cached file handle. I'm splitting hairs about the exact meaning of cached. Every file handle is always in the cache, so IO_MGR_NUM_CACHED_FILE_HANDLES could be the same as IO_MGR_NUM_FILE_HANDLES_OUTSTANDING. That would be understandable. The meaning I'm using here is that this is a file handle that is in the cache and not in use. I'm not sure if that is useful. Line 1395: if (p.size_ > p.capacity_) { > single line Done http://gerrit.cloudera.org:8080/#/c/6478/4/be/src/runtime/disk-io-mgr.h File be/src/runtime/disk-io-mgr.h: Line 231: /// Threads checkout a file handle for exclusive access and return it when finished. > "check out" (checkout is a noun) Done Line 237: template > please pull this along with hdfsfilehandle out into a separate file, this f Moved this out to disk-io-mgr-handle-cache.h/disk-io-mgr-handle-cache.inline.h. disk-io-mgr-handle-cache.h is included in disk-io-mgr.h. 
disk-io-mgr-handle-cache.inline.h is only used in disk-io-mgr.cc Line 242: /// first k partitions will each have one additional file handle. > let's make the size uniform across all partitions (e.g., by rounding up), t Done Line 273:: fh_(fh), reference_count_(0) {} > single line Done Line 276: int reference_count_; > why refcount instead of just 'bool in_use' or something like that? Switched to in_use. Line 282: /// the partitions are aligned to cache line boundaries (64 bytes). > hm, isn't there some predefined constant somewhere for the cache line size? std::hardware_destructive_interference_size is in C++17. There is a way to query the system to get it, but I haven't seen a compile time definition. Line 549: Status Open(bool use_file_handle_cache) WARN_UNUSED_RESULT; > does it ever make sense to call open on the same scan range with different This happens on the HDFS caching codepath. If we think a scan range is cached, we call Open(false), but then if caching falls through, we call Close and then reuse the scan range on the normal codepath, which can call it with either true or false. Clarified the comment. Line 573: void* meta_data_; > initialize nullptr here (for all poi
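The remote_bytes_ exchange above hinges on the pattern the commit message describes: libhdfs keeps read statistics per file handle, so once handles are shared, each read must harvest the delta and clear the counters before the next user of the handle reads. A minimal sketch of that harvest-and-reset pattern, with a plain struct standing in for libhdfs's statistics (field names chosen to mirror the totalBytesRead/totalLocalBytesRead fields quoted above; this is not Impala's code):

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for the per-handle counters libhdfs maintains.
struct ReadStats {
  int64_t total_bytes_read = 0;
  int64_t total_local_bytes_read = 0;
};

// After each read: harvest the remote-byte delta, then clear the handle's
// counters so the next scan range using the shared handle starts from zero.
int64_t HarvestRemoteBytes(ReadStats* handle_stats) {
  int64_t remote =
      handle_stats->total_bytes_read - handle_stats->total_local_bytes_read;
  *handle_stats = ReadStats{};  // the "cleared after each read" step
  return remote;
}
```

If the clear step were skipped, a second scan range on the same handle would double-count the first range's bytes, which is why the commit message couples retrieval and clearing.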
[Impala-ASF-CR] IMPALA-4623: Enable file handle cache
Marcel Kornacker has posted comments on this change. Change subject: IMPALA-4623: Enable file handle cache .. Patch Set 4: (8 comments) http://gerrit.cloudera.org:8080/#/c/6478/4/be/src/runtime/disk-io-mgr.cc File be/src/runtime/disk-io-mgr.cc: Line 1309: string key = string(fname) + ":" + std::to_string(mtime); you can get rid of dealing with strings here (and doing a bunch of allocations) entirely by using something out of util/hash-util.h to compute the hash on fname and mtime separately (no need to convert it to a string). Line 1315: // Get lock that's obvious, no need to paraphrase a single statement (unless there's something subtle going on). Line 1318: // Look up in the multimap (find range of values for key) same here Line 1326: if (!elem->reference_count_) { use explicit numeric comparisons, this isn't a bool (although maybe it should be). Line 1331: ImpaladMetrics::IO_MGR_NUM_CACHED_FILE_HANDLES->Increment(-1L); hm, i'm not sure we should do that. it's still a cached file handle. Line 1395: if (p.size_ > p.capacity_) { single line http://gerrit.cloudera.org:8080/#/c/6478/4/be/src/runtime/disk-io-mgr.h File be/src/runtime/disk-io-mgr.h: Line 273:: fh_(fh), reference_count_(0) {} single line Line 276: int reference_count_; why refcount instead of just 'bool in_use' or something like that? -- Gerrit-MessageType: comment Gerrit-Change-Id: Ibe5ff60971dd653c3b6a0e13928cfa9fc59d078d Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
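Marcel's last question above (refcount vs. a plain bool) is resolved later in the thread in favor of an in_use flag, with the invariant that eviction never removes a checked-out handle. A toy single-partition version of that invariant (ints standing in for file handles; all names hypothetical):

```cpp
#include <cassert>
#include <list>

struct Entry {
  int handle;
  bool in_use = false;  // set while a thread has the handle checked out
};

// Evict the oldest entry that is not checked out. Returns true if something
// was evicted; false means every entry is in use and the partition stays
// over capacity until a handle is returned.
bool EvictOne(std::list<Entry>* partition) {
  for (auto it = partition->begin(); it != partition->end(); ++it) {
    if (!it->in_use) {
      partition->erase(it);
      return true;
    }
  }
  return false;
}
```

A bool suffices here because checkout is exclusive: a handle is either held by exactly one thread or sitting idle, so there is never a count above one to track.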
[Impala-ASF-CR] IMPALA-4623: Enable file handle cache
Marcel Kornacker has posted comments on this change. Change subject: IMPALA-4623: Enable file handle cache .. Patch Set 4: (18 comments) looking good. http://gerrit.cloudera.org:8080/#/c/6478/4/be/src/exec/hdfs-scan-node-base.cc File be/src/exec/hdfs-scan-node-base.cc: Line 894: runtime_state_->io_mgr()->cached_file_handles_hit_count(reader_context_)); pretty awkward. why not call the reader directly (and maybe introduce a new getter)? http://gerrit.cloudera.org:8080/#/c/6478/4/be/src/exec/hdfs-scan-node-base.h File be/src/exec/hdfs-scan-node-base.h: Line 441: RuntimeProfile::Counter* cached_file_handles_hit_count_; initialize to nullptr here, that way you don't need to do it in the c'tor (which you have to edit if the member order changes, or which you might have to duplicate if you multiple c'tors). do that for all pointer members. http://gerrit.cloudera.org:8080/#/c/6478/4/be/src/runtime/buffered-block-mgr.h File be/src/runtime/buffered-block-mgr.h: Line 25: #include why? http://gerrit.cloudera.org:8080/#/c/6478/4/be/src/runtime/disk-io-mgr-scan-range.cc File be/src/runtime/disk-io-mgr-scan-range.cc: Line 342: io_mgr_->CacheOrCloseFileHandle(file(), exclusive_hdfs_fh_); yes, the name really needs to distinguish between cache operations and direct operations on handles. Line 548: remote_bytes_ += stats->totalBytesRead - stats->totalLocalBytesRead; why not stick that into reader_ as well? http://gerrit.cloudera.org:8080/#/c/6478/4/be/src/runtime/disk-io-mgr.h File be/src/runtime/disk-io-mgr.h: Line 231: /// Threads checkout a file handle for exclusive access and return it when finished. "check out" (checkout is a noun) Line 237: template please pull this along with hdfsfilehandle out into a separate file, this file is large enough as it is. Line 242: /// first k partitions will each have one additional file handle. let's make the size uniform across all partitions (e.g., by rounding up), that should simplify the code. 
Line 282: /// the partitions are aligned to cache line boundaries (64 bytes). hm, isn't there some predefined constant somewhere for the cache line size? Line 549: Status Open(bool use_file_handle_cache) WARN_UNUSED_RESULT; does it ever make sense to call open on the same scan range with different values of the bool? if not, make a parameter of the c'tor? point out clearly in the comment that if the param is true, open does nothing. Line 573: void* meta_data_; initialize nullptr here (for all pointer members) Line 587: /// Total number of bytes read remotely point out why we need to maintain this particular metric instead of in requestcontext. Line 601: /// handle, and no sharing takes place. also mention when this is populated and how long it's valid. Line 856: HdfsFileHandle* OpenHdfsFile(const hdfsFS& fs, if this always returns a cached handle, would be good to reflect that in the name. Line 860: void CacheOrCloseFileHandle(const char* fname, HdfsFileHandle* fid); rename http://gerrit.cloudera.org:8080/#/c/6478/4/tests/query_test/test_hdfs_fd_caching.py File tests/query_test/test_hdfs_fd_caching.py: Line 76: # File deleted: either # rows is the same or # rows = 0 we don't make any guarantees for this, so might as well not check the result. we just don't want to crash. Line 98: new_table_location = "{0}/test-warehouse/{1}".format(FILESYSTEM_PREFIX, unique_database) long lines Line 105: # Delete the whole directory (including data file) this function is pretty much identical to the one above (and below), except for the specific modification action. consolidate into a single function and drive that with an enum parameter. 
-- Gerrit-MessageType: comment Gerrit-Change-Id: Ibe5ff60971dd653c3b6a0e13928cfa9fc59d078d Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
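On the cache-line question above: Joe notes std::hardware_destructive_interference_size is C++17-only, so the patch hard-codes 64 bytes. A sketch of what aligning the partitions buys — each partition's lock lands on its own cache line, so threads hammering different partitions don't false-share (names here are illustrative, not the patch's):

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>

constexpr size_t CACHE_LINE_SIZE = 64;  // hard-coded, as discussed; C++17
    // later offers std::hardware_destructive_interference_size for this

// alignas pads each partition out to a cache-line boundary, so two
// partitions' locks can never share a line.
struct alignas(CACHE_LINE_SIZE) CachePartition {
  std::mutex lock;
  size_t size = 0;
  size_t capacity = 0;
};

CachePartition partitions[16];
```

Without the alignment, adjacent partitions' mutexes could sit on one cache line, and uncontended locks in different partitions would still ping-pong that line between cores.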
[Impala-ASF-CR] IMPALA-4623: Enable file handle cache
Joe McDonnell has uploaded a new patch set (#4). Change subject: IMPALA-4623: Enable file handle cache .. IMPALA-4623: Enable file handle cache Currently, every scan range maintains a file handle, even when multiple scan ranges are accessing the same file. Opening the file handles causes load on the NameNode, which can lead to scaling issues. There are two parts to this transaction: 1. Enable file handle caching by default 2. Share the file handle between scan ranges from the same file The scan range no longer maintains its own Hdfs file handle. On each read, the io thread will get the Hdfs file handle from the cache (opening it if necessary) and use that for the read. This allows multiple scan ranges on the same file to use the same file handle. Since the file offsets are no longer consistent for an individual scan range, all Hdfs reads need to either use hdfsPread or do a seek before reading. Additionally, since Hdfs read statistics are maintained on the file handle, the read statistics must be retrieved and cleared after each read. Scan ranges that are accessing data cached by Hdfs will get a file handle from the cache, but the file handle will be kept on the scan range for the time that the scan range is in use. This prevents the cache from closing the file handle while the data buffer is in use. To manage contention, the file handle cache is now partitioned by a hash of the key into independent caches with independent locks. The allowed capacity of the file handle cache is split evenly among the partitions. File handles are evicted independently for each partition. The file handle cache maintains ownership of the file handles at all times, but it will not evict a file handle that is in use. If max_cached_file_handles is set to 0, file handle caching is off and the previous behavior applies. Tests: test_hdfs_fd_caching.py copies the files from an existing table into a new directory and uses that to create an external table. 
It queries the external table, then uses the hdfs commandline to manipulate the hdfs file (delete, move, etc). It queries again to make sure we don't crash. Then, it runs "invalidate metadata". In the delete case, we expect zero rows. In the move case, we expect the same number of rows. TODO: 1. Determine appropriate defaults. 2. Other tests a. File overwrite b. Any others? 3. For scan ranges that use Hdfs caching, should there be some sharing at the scanner level? Change-Id: Ibe5ff60971dd653c3b6a0e13928cfa9fc59d078d --- M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/runtime/buffered-block-mgr.h M be/src/runtime/disk-io-mgr-internal.h M be/src/runtime/disk-io-mgr-reader-context.cc M be/src/runtime/disk-io-mgr-scan-range.cc M be/src/runtime/disk-io-mgr.cc M be/src/runtime/disk-io-mgr.h M tests/query_test/test_hdfs_fd_caching.py 9 files changed, 639 insertions(+), 311 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/6478/4 -- Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ibe5ff60971dd653c3b6a0e13928cfa9fc59d078d Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4623: Enable file handle cache
Marcel Kornacker has posted comments on this change. Change subject: IMPALA-4623: Enable file handle cache .. Patch Set 3: (9 comments) i don't understand the need for unique_ptrs for the file handles, please see comments inline. http://gerrit.cloudera.org:8080/#/c/6478/3/be/src/runtime/disk-io-mgr-scan-range.cc File be/src/runtime/disk-io-mgr-scan-range.cc: Line 286: if (fs_ != NULL) { replace w/ nullptr throughout (and in the other files you touch). http://gerrit.cloudera.org:8080/#/c/6478/3/be/src/runtime/disk-io-mgr.h File be/src/runtime/disk-io-mgr.h: Line 205: /// additionally encapsulates the last modified time of the associated file when it was fix this sentence. Line 207: class HdfsCachedFileHandle { we use this class exclusively to access hdfs data, even if the file handle isn't cached. remove 'cached' from the name, it's distracting. Line 476: /// Opens the file for this range. This function only modifies state in this range. explain param Line 489: void GetHdfsStatistics(hdfsFile fh); is this okay to pass by value? what diskiorequestcontext is this referring to? the comment should make sense on its own. Line 521: /// File handle for local fs (FILE*) remove (file*) Line 526: std::unique_ptr hdfs_file_; requestrange already contains an fs_. what of the functionality provided by hdfscachedfilehandle do you actually need here? this class also contains an mtime_, so lots of confusing and error-prone duplication. this encapsulates a file handle out of the cache. but if the cache is used to manage all file handles, why do you need to store this with a unique_ptr (instead of a raw ptr, and then simply hand that back to the cache when you're done)? this looks like it contradicts the responsibility of the cache for managing file handles. Line 774: /// and return an instance of HdfsCachedFileHandle. In case of an error returns NULL. 
explain all parameters Line 784: std::unique_ptr& fid); no ref parameters -- Gerrit-MessageType: comment Gerrit-Change-Id: Ibe5ff60971dd653c3b6a0e13928cfa9fc59d078d Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4623: Enable file handle cache
Marcel Kornacker has posted comments on this change. Change subject: IMPALA-4623: Enable file handle cache .. Patch Set 3: (11 comments) initial thoughts on the partitioned map. http://gerrit.cloudera.org:8080/#/c/6478/3/be/src/util/lru-cache.h File be/src/util/lru-cache.h: Line 67: typedef void (*DeleterFn)(Value&); why now a ref? Line 82: /// Variant of Put with move semantics. is this needed? Line 91: bool Pop(const Key& k, Value& out); we typically don't use refs for this purpose. why the switch? Line 102: static void DummyDeleter(Value& v) {} make private Line 115: typedef std::pair ValueType; value of what? Line 139: /// Wrapper around FifoMultimap that hashes the keys into a fixed number of partitions which hash fn? Line 146: template you can make the number of partitions another template parameter, that way you can use a std::array for cache_partitions_. i also don't see a real need for the unique_ptr in there. fewer indirection should be faster. it would also be good to put the individual Fifomultimap objects on cache line boundaries. alignas might help here. Line 151: /// See FifoMultimap this doesn't explain how you deal with the remainder. you could also simplify this and stipulate that capacity is per-partition. http://gerrit.cloudera.org:8080/#/c/6478/3/be/src/util/lru-cache.inline.h File be/src/util/lru-cache.inline.h: Line 36: const typename MapType::value_type& val = i think if you use 'using' for MapType instead of typedef you can get rid of the typename. Line 37: std::make_pair(k, lru_list_.emplace(lru_list_.end(), k, std::move(v))); is the explicit move needed for the rvalue v? 
Line 51: void FifoMultimap::Put(const Key& k, const Value& v) { there should be a way not to require basically a verbatim copy of this function -- Gerrit-MessageType: comment Gerrit-Change-Id: Ibe5ff60971dd653c3b6a0e13928cfa9fc59d078d Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
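Marcel's last comment — the copy and move Put overloads shouldn't need verbatim duplicate bodies — is usually solved with a single private forwarding implementation. A minimal sketch (hypothetical FifoCache, not the review's actual FifoMultimap):

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <string>
#include <utility>

// One PutImpl serves both public overloads: the forwarding reference V&&
// collapses to a const lvalue ref (copy) or an rvalue ref (move), so the
// copy and move variants share a single body.
template <typename Key, typename Value>
class FifoCache {
 public:
  void Put(const Key& k, const Value& v) { PutImpl(k, v); }
  void Put(const Key& k, Value&& v) { PutImpl(k, std::move(v)); }
  size_t size() const { return list_.size(); }
  const Value& back() const { return list_.back().second; }

 private:
  template <typename V>
  void PutImpl(const Key& k, V&& v) {
    list_.emplace_back(k, std::forward<V>(v));
  }
  std::list<std::pair<Key, Value>> list_;
};
```

std::forward preserves the caller's value category, so the copy overload copies and the move overload moves, with no duplicated insertion logic to keep in sync.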
[Impala-ASF-CR] IMPALA-4623: Enable file handle cache
Joe McDonnell has uploaded a new patch set (#3). Change subject: IMPALA-4623: Enable file handle cache .. IMPALA-4623: Enable file handle cache Currently, every scan range maintains a file handle, even when multiple scan ranges are accessing the same file. Opening the file handles causes load on the NameNode, which can lead to scaling issues. There are two parts to this transaction: 1. Enable file handle caching by default 2. Share the file handle between scan ranges from the same file The scan range no longer maintains its own Hdfs file handle. On each read, the io thread will get the Hdfs file handle from the cache (opening it if necessary) and use that for the read. This allows multiple scan ranges on the same file to use the same file handle. Since the file offsets are no longer consistent for an individual scan range, all Hdfs reads need to either use hdfsPread or do a seek before reading. Additionally, since Hdfs read statistics are maintained on the file handle, the read statistics must be retrieved and cleared after each read. Scan ranges that are accessing data cached by Hdfs will get a file handle from the cache, but the file handle will be kept on the scan range for the time that the scan range is in use. This prevents the cache from closing the file handle while the data buffer is in use. To manage contention, the file handle cache is now partitioned by a hash of the key into independent caches with independent locks. The allowed capacity of the file handle cache is split evenly among the partitions. File handles are evicted independently for each partition. If max_cached_file_handles is set to 0, file handle caching is off and the previous behavior applies. Tests: test_hdfs_fd_caching.py copies the files from an existing table into a new directory and uses that to create an external table. It queries the external table, then uses the hdfs commandline to manipulate the hdfs file (delete, move, etc). It queries again to make sure we don't crash.
Then, it runs "invalidate metadata". In the delete case, we expect zero rows. In the move case, we expect the same number of rows. TODO: 1. Determine appropriate defaults. 2. Other tests a. File overwrite b. Any others? 3. For scan ranges that use Hdfs caching, should there be some sharing at the scanner level? Change-Id: Ibe5ff60971dd653c3b6a0e13928cfa9fc59d078d --- M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/runtime/buffered-block-mgr.h M be/src/runtime/disk-io-mgr-internal.h M be/src/runtime/disk-io-mgr-reader-context.cc M be/src/runtime/disk-io-mgr-scan-range.cc M be/src/runtime/disk-io-mgr.cc M be/src/runtime/disk-io-mgr.h M be/src/util/lru-cache-test.cc M be/src/util/lru-cache.h M be/src/util/lru-cache.inline.h M tests/query_test/test_hdfs_fd_caching.py 12 files changed, 498 insertions(+), 166 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/6478/3 -- Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ibe5ff60971dd653c3b6a0e13928cfa9fc59d078d Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4623: Enable file handle cache
Joe McDonnell has uploaded a new patch set (#2). Change subject: IMPALA-4623: Enable file handle cache .. IMPALA-4623: Enable file handle cache Currently, every scan range maintains a file handle, even when multiple scan ranges are accessing the same file. Opening the file handles causes load on the NameNode, which can lead to scaling issues. There are two parts to this transaction: 1. Enable file handle caching by default 2. Share the file handle between scan ranges from the same file The scan range no longer maintains its own Hdfs file handle. On each read, the io thread will get the Hdfs file handle from the cache (opening it if necessary) and use that for the read. This allows multiple scan ranges on the same file to use the same file handle. Since the file offsets are no longer consistent for an individual scan range, all Hdfs reads need to either use hdfsPread or do a seek before reading. Additionally, since Hdfs read statistics are maintained on the file handle, the read statistics must be retrieved and cleared after each read. Scan ranges that are accessing data cached by Hdfs will get a file handle from the cache, but the file handle will be kept on the scan range for the time that the scan range is in use. This prevents the cache from closing the file handle while the data buffer is in use. To manage contention, the file handle cache is now partitioned by a hash of the key into independent caches with independent locks. The allowed capacity of the file handle cache is split evenly among the partitions. File handles are evicted independently for each partition. If max_cached_file_handles is set to 0, file handle caching is off and the previous behavior applies. TODO: 1. Determine appropriate defaults. 2. Write tests a. Delete test b. Block rebalance test? 3. For scan ranges that use Hdfs caching, should there be some sharing at the scanner level?
Change-Id: Ibe5ff60971dd653c3b6a0e13928cfa9fc59d078d --- M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/runtime/buffered-block-mgr.h M be/src/runtime/disk-io-mgr-internal.h M be/src/runtime/disk-io-mgr-reader-context.cc M be/src/runtime/disk-io-mgr-scan-range.cc M be/src/runtime/disk-io-mgr.cc M be/src/runtime/disk-io-mgr.h M be/src/util/lru-cache-test.cc M be/src/util/lru-cache.h M be/src/util/lru-cache.inline.h 11 files changed, 387 insertions(+), 124 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/6478/2 -- Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ibe5ff60971dd653c3b6a0e13928cfa9fc59d078d Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong