[Impala-ASF-CR] IMPALA-7019: Schedule EC as remote & disable failed tests
Tianyi Wang has posted comments on this change. ( http://gerrit.cloudera.org:8080/10413 )
Change subject: IMPALA-7019: Schedule EC as remote & disable failed tests
Patch Set 6:
(1 comment)
http://gerrit.cloudera.org:8080/#/c/10413/6/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java:
http://gerrit.cloudera.org:8080/#/c/10413/6/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@118
PS6, Line 118: FileBlock.createFbFileBlock(fbb,
             : loc.getOffset(), loc.getLength(),
             : (short) hostIndex.getIndex(REMOTE_NETWORK_ADDRESS));
> I can try changing it to generating scan ranges in the backend later. I thi
Sorry, I didn't see your last comment, and I hadn't looked at L186. In that case there shouldn't be a difference between synthesizing the block locations and using the existing ones. I will make the change to generate scan ranges in the backend.
--
To view, visit http://gerrit.cloudera.org:8080/10413
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84
Gerrit-Change-Number: 10413
Gerrit-PatchSet: 6
Gerrit-Owner: Tianyi Wang
Gerrit-Reviewer: Alex Behm
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Taras Bobrovytsky
Gerrit-Reviewer: Tianyi Wang
Gerrit-Reviewer: Vuk Ercegovac
Gerrit-Comment-Date: Tue, 22 May 2018 18:34:35 +
Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7019: Schedule EC as remote & disable failed tests
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/10413 )
Change subject: IMPALA-7019: Schedule EC as remote & disable failed tests
Patch Set 6:
(1 comment)
http://gerrit.cloudera.org:8080/#/c/10413/6/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java:
http://gerrit.cloudera.org:8080/#/c/10413/6/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@118
PS6, Line 118: FileBlock.createFbFileBlock(fbb,
             : loc.getOffset(), loc.getLength(),
             : (short) hostIndex.getIndex(REMOTE_NETWORK_ADDRESS));
> The reason is not strong: Though EC reads are remote, we might still don't
OK, so you were worried that the synthetic division into blocks could differ from how a file is chunked into blocks on HDFS. Assuming the block size provided by FileStatus never changes (L186), when would the two chunking schemes differ?
--
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84
Gerrit-Change-Number: 10413
Gerrit-PatchSet: 6
Gerrit-Owner: Tianyi Wang
Gerrit-Reviewer: Alex Behm
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Taras Bobrovytsky
Gerrit-Reviewer: Tianyi Wang
Gerrit-Reviewer: Vuk Ercegovac
Gerrit-Comment-Date: Tue, 22 May 2018 18:27:05 +
Gerrit-HasComments: Yes
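The question above can be made concrete with a small model. The following is a minimal sketch, not Impala's code: `synthesize_blocks` and `same_chunking` are hypothetical names, and blocks are modelled as plain (offset, length) pairs. It splits a file into fixed-size ranges and checks whether a reported block list matches that split. It makes no assumption about what HDFS actually reports for EC files.

```python
def synthesize_blocks(file_length, block_size):
    """Split [0, file_length) into (offset, length) ranges of at most block_size.

    Hypothetical helper for illustration; not taken from the patch.
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    blocks = []
    offset = 0
    while offset < file_length:
        length = min(block_size, file_length - offset)
        blocks.append((offset, length))
        offset += length
    return blocks


def same_chunking(reported, file_length, block_size):
    """Return True if the reported (offset, length) list equals the synthetic split."""
    return list(reported) == synthesize_blocks(file_length, block_size)
```

Under this model the two schemes agree whenever every reported block except possibly the last has exactly `block_size` bytes; any other reported layout makes `same_chunking` return False.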
[Impala-ASF-CR] IMPALA-7019: Schedule EC as remote & disable failed tests
Tianyi Wang has posted comments on this change. ( http://gerrit.cloudera.org:8080/10413 )
Change subject: IMPALA-7019: Schedule EC as remote & disable failed tests
Patch Set 6:
(1 comment)
http://gerrit.cloudera.org:8080/#/c/10413/6/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java:
http://gerrit.cloudera.org:8080/#/c/10413/6/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@118
PS6, Line 118: FileBlock.createFbFileBlock(fbb,
             : loc.getOffset(), loc.getLength(),
             : (short) hostIndex.getIndex(REMOTE_NETWORK_ADDRESS));
> The reason is not strong: Though EC reads are remote, we might still don't
I can try changing it to generate scan ranges in the backend later. I think it's fine to leave it as it is now and merge your change.
--
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84
Gerrit-Change-Number: 10413
Gerrit-PatchSet: 6
Gerrit-Owner: Tianyi Wang
Gerrit-Reviewer: Alex Behm
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Taras Bobrovytsky
Gerrit-Reviewer: Tianyi Wang
Gerrit-Reviewer: Vuk Ercegovac
Gerrit-Comment-Date: Tue, 22 May 2018 18:27:14 +
Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7019: Schedule EC as remote & disable failed tests
Tianyi Wang has posted comments on this change. ( http://gerrit.cloudera.org:8080/10413 )
Change subject: IMPALA-7019: Schedule EC as remote & disable failed tests
Patch Set 6:
(1 comment)
http://gerrit.cloudera.org:8080/#/c/10413/6/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java:
http://gerrit.cloudera.org:8080/#/c/10413/6/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@118
PS6, Line 118: FileBlock.createFbFileBlock(fbb,
             : loc.getOffset(), loc.getLength(),
             : (short) hostIndex.getIndex(REMOTE_NETWORK_ADDRESS));
> I know this is merged, but just wanted to revisit why we're storing block l
The reason is not strong: though EC reads are remote, we still might not want one block to be read as two blocks. Reading across a block boundary might lead to more connections to name nodes and data nodes.
--
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84
Gerrit-Change-Number: 10413
Gerrit-PatchSet: 6
Gerrit-Owner: Tianyi Wang
Gerrit-Reviewer: Alex Behm
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Taras Bobrovytsky
Gerrit-Reviewer: Tianyi Wang
Gerrit-Reviewer: Vuk Ercegovac
Gerrit-Comment-Date: Tue, 22 May 2018 18:17:42 +
Gerrit-HasComments: Yes
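To illustrate the boundary concern above, here is a small arithmetic sketch. It assumes fixed-size blocks aligned at multiples of `block_size`, which is a simplification, and `blocks_touched` is a hypothetical name rather than anything in Impala. It counts how many blocks a byte range overlaps, so a range that straddles a boundary counts two blocks where an aligned range counts one.

```python
def blocks_touched(offset, length, block_size):
    """Count the fixed-size blocks overlapped by the byte range [offset, offset + length).

    Assumes blocks start at multiples of block_size (an illustrative simplification).
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    if length <= 0:
        return 0
    first = offset // block_size
    last = (offset + length - 1) // block_size
    return last - first + 1
```

For example, with a block size of 128, a 128-byte range starting at offset 0 touches one block, while the same-sized range starting at offset 64 touches two.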
[Impala-ASF-CR] IMPALA-7019: Schedule EC as remote & disable failed tests
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/10413 )
Change subject: IMPALA-7019: Schedule EC as remote & disable failed tests
Patch Set 6:
(1 comment)
http://gerrit.cloudera.org:8080/#/c/10413/6/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java:
http://gerrit.cloudera.org:8080/#/c/10413/6/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@118
PS6, Line 118: FileBlock.createFbFileBlock(fbb,
             : loc.getOffset(), loc.getLength(),
             : (short) hostIndex.getIndex(REMOTE_NETWORK_ADDRESS));
I know this is merged, but I just wanted to revisit why we're storing block locations for EC files this way rather than synthesizing them via L137.
--
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84
Gerrit-Change-Number: 10413
Gerrit-PatchSet: 6
Gerrit-Owner: Tianyi Wang
Gerrit-Reviewer: Alex Behm
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Taras Bobrovytsky
Gerrit-Reviewer: Tianyi Wang
Gerrit-Reviewer: Vuk Ercegovac
Gerrit-Comment-Date: Tue, 22 May 2018 17:58:42 +
Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7019: Schedule EC as remote & disable failed tests
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/10413 )
Change subject: IMPALA-7019: Schedule EC as remote & disable failed tests

IMPALA-7019: Schedule EC as remote & disable failed tests

This patch schedules HDFS EC files without considering locality. Failed tests are disabled, and a Jenkins build should succeed with export ERASURE_CODING=true.

Testing: It passes core tests.
Cherry-picks: not for 2.x.

Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84
Reviewed-on: http://gerrit.cloudera.org:8080/10413
Reviewed-by: Taras Bobrovytsky
Tested-by: Impala Public Jenkins
---
M common/fbs/CatalogObjects.fbs
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M tests/common/skip.py
M tests/custom_cluster/test_admission_controller.py
M tests/custom_cluster/test_hdfs_fd_caching.py
M tests/metadata/test_explain.py
M tests/query_test/test_hdfs_caching.py
M tests/query_test/test_insert.py
M tests/query_test/test_insert_parquet.py
M tests/query_test/test_mt_dop.py
M tests/query_test/test_nested_types.py
M tests/query_test/test_queries.py
M tests/query_test/test_query_mem_limit.py
M tests/query_test/test_scanners.py
M tests/util/filesystem_utils.py
17 files changed, 75 insertions(+), 28 deletions(-)

Approvals:
  Taras Bobrovytsky: Looks good to me, approved
  Impala Public Jenkins: Verified
--
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84
Gerrit-Change-Number: 10413
Gerrit-PatchSet: 6
Gerrit-Owner: Tianyi Wang
Gerrit-Reviewer: Alex Behm
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Taras Bobrovytsky
Gerrit-Reviewer: Tianyi Wang
[Impala-ASF-CR] IMPALA-7019: Schedule EC as remote & disable failed tests
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10413 )
Change subject: IMPALA-7019: Schedule EC as remote & disable failed tests
Patch Set 5: Verified+1
--
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84
Gerrit-Change-Number: 10413
Gerrit-PatchSet: 5
Gerrit-Owner: Tianyi Wang
Gerrit-Reviewer: Alex Behm
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Taras Bobrovytsky
Gerrit-Reviewer: Tianyi Wang
Gerrit-Comment-Date: Tue, 22 May 2018 01:10:12 +
Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7019: Schedule EC as remote & disable failed tests
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10413 )
Change subject: IMPALA-7019: Schedule EC as remote & disable failed tests
Patch Set 5:
Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2519/
--
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84
Gerrit-Change-Number: 10413
Gerrit-PatchSet: 5
Gerrit-Owner: Tianyi Wang
Gerrit-Reviewer: Alex Behm
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Taras Bobrovytsky
Gerrit-Reviewer: Tianyi Wang
Gerrit-Comment-Date: Mon, 21 May 2018 21:43:09 +
Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7019: Schedule EC as remote & disable failed tests
Taras Bobrovytsky has posted comments on this change. ( http://gerrit.cloudera.org:8080/10413 )
Change subject: IMPALA-7019: Schedule EC as remote & disable failed tests
Patch Set 5: Code-Review+2
--
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84
Gerrit-Change-Number: 10413
Gerrit-PatchSet: 5
Gerrit-Owner: Tianyi Wang
Gerrit-Reviewer: Alex Behm
Gerrit-Reviewer: Taras Bobrovytsky
Gerrit-Reviewer: Tianyi Wang
Gerrit-Comment-Date: Mon, 21 May 2018 21:33:46 +
Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7019: Schedule EC as remote & disable failed tests
Tianyi Wang has uploaded a new patch set (#5). ( http://gerrit.cloudera.org:8080/10413 )
Change subject: IMPALA-7019: Schedule EC as remote & disable failed tests

IMPALA-7019: Schedule EC as remote & disable failed tests

This patch schedules HDFS EC files without considering locality. Failed tests are disabled, and a Jenkins build should succeed with export ERASURE_CODING=true.

Testing: It passes core tests.
Cherry-picks: not for 2.x.

Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84
---
M common/fbs/CatalogObjects.fbs
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M tests/common/skip.py
M tests/custom_cluster/test_admission_controller.py
M tests/custom_cluster/test_hdfs_fd_caching.py
M tests/metadata/test_explain.py
M tests/query_test/test_hdfs_caching.py
M tests/query_test/test_insert.py
M tests/query_test/test_insert_parquet.py
M tests/query_test/test_mt_dop.py
M tests/query_test/test_nested_types.py
M tests/query_test/test_queries.py
M tests/query_test/test_query_mem_limit.py
M tests/query_test/test_scanners.py
M tests/util/filesystem_utils.py
17 files changed, 75 insertions(+), 28 deletions(-)

git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/13/10413/5
--
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84
Gerrit-Change-Number: 10413
Gerrit-PatchSet: 5
Gerrit-Owner: Tianyi Wang
Gerrit-Reviewer: Alex Behm
Gerrit-Reviewer: Taras Bobrovytsky
Gerrit-Reviewer: Tianyi Wang
[Impala-ASF-CR] IMPALA-7019: Schedule EC as remote & disable failed tests
Taras Bobrovytsky has posted comments on this change. ( http://gerrit.cloudera.org:8080/10413 )
Change subject: IMPALA-7019: Schedule EC as remote & disable failed tests
Patch Set 4:
(1 comment)
http://gerrit.cloudera.org:8080/#/c/10413/4/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java:
http://gerrit.cloudera.org:8080/#/c/10413/4/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@129
PS4, Line 129: static FileDescriptor createEc(FileStatus fileStatus, BlockLocation[] blockLocations,
The logic here seems almost identical to the create() function. Maybe modify the create() function to accept an isEC argument instead of creating a new function?
--
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84
Gerrit-Change-Number: 10413
Gerrit-PatchSet: 4
Gerrit-Owner: Tianyi Wang
Gerrit-Reviewer: Alex Behm
Gerrit-Reviewer: Taras Bobrovytsky
Gerrit-Reviewer: Tianyi Wang
Gerrit-Comment-Date: Fri, 18 May 2018 23:05:03 +
Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7019: Schedule EC as remote & disable failed tests
Tianyi Wang has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/10413 )
Change subject: IMPALA-7019: Schedule EC as remote & disable failed tests

IMPALA-7019: Schedule EC as remote & disable failed tests

This patch schedules HDFS EC files without considering locality. Failed tests are disabled, and a Jenkins build should succeed with export ERASURE_CODING=true.

Testing: It passes core tests.
Cherry-picks: not for 2.x.

Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84
---
M common/fbs/CatalogObjects.fbs
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M tests/common/skip.py
M tests/custom_cluster/test_admission_controller.py
M tests/custom_cluster/test_hdfs_fd_caching.py
M tests/metadata/test_explain.py
M tests/query_test/test_hdfs_caching.py
M tests/query_test/test_insert.py
M tests/query_test/test_insert_parquet.py
M tests/query_test/test_mt_dop.py
M tests/query_test/test_nested_types.py
M tests/query_test/test_queries.py
M tests/query_test/test_query_mem_limit.py
M tests/query_test/test_scanners.py
M tests/util/filesystem_utils.py
17 files changed, 88 insertions(+), 20 deletions(-)

git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/13/10413/4
--
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84
Gerrit-Change-Number: 10413
Gerrit-PatchSet: 4
Gerrit-Owner: Tianyi Wang
Gerrit-Reviewer: Alex Behm
Gerrit-Reviewer: Taras Bobrovytsky
Gerrit-Reviewer: Tianyi Wang
[Impala-ASF-CR] IMPALA-7019: Schedule EC as remote & disable failed tests
Taras Bobrovytsky has posted comments on this change. ( http://gerrit.cloudera.org:8080/10413 )
Change subject: IMPALA-7019: Schedule EC as remote & disable failed tests
Patch Set 3:
(4 comments)
http://gerrit.cloudera.org:8080/#/c/10413/3//COMMIT_MSG
Commit Message:
http://gerrit.cloudera.org:8080/#/c/10413/3//COMMIT_MSG@12
PS3, Line 12:
Mention how this patch was tested. Did you run the exhaustive build?

http://gerrit.cloudera.org:8080/#/c/10413/3/tests/common/skip.py
File tests/common/skip.py:
http://gerrit.cloudera.org:8080/#/c/10413/3/tests/common/skip.py@29
PS3, Line 29: IS_ISILON,
            : IS_LOCAL,
            : IS_HDFS,
            : IS_S3,
            : IS_ADLS,
            : SECONDARY_FILESYSTEM,
            : IS_EC
This should be sorted alphabetically.

http://gerrit.cloudera.org:8080/#/c/10413/3/tests/common/skip.py@150
PS3, Line 150: doesn't work.
"do not work"

http://gerrit.cloudera.org:8080/#/c/10413/3/tests/util/filesystem_utils.py
File tests/util/filesystem_utils.py:
http://gerrit.cloudera.org:8080/#/c/10413/3/tests/util/filesystem_utils.py@33
PS3, Line 33: os.getenv("ERASURE_CODING")
We want to check if ERASURE_CODING == true here.
--
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84
Gerrit-Change-Number: 10413
Gerrit-PatchSet: 3
Gerrit-Owner: Tianyi Wang
Gerrit-Reviewer: Alex Behm
Gerrit-Reviewer: Taras Bobrovytsky
Gerrit-Reviewer: Tianyi Wang
Gerrit-Comment-Date: Fri, 18 May 2018 21:57:14 +
Gerrit-HasComments: Yes
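The last comment above matters because os.getenv returns a string, so any non-empty value, including "false", is truthy when used directly as a flag. A minimal sketch of the stricter check follows. The helper name `is_ec_enabled` and the `environ` parameter are assumptions added for illustration and testability; they are not taken from the patch.

```python
import os


def is_ec_enabled(environ=None):
    """Return True only when ERASURE_CODING is set to the exact string "true".

    Hypothetical helper: `environ` defaults to os.environ and exists so the
    check can be exercised without modifying the real environment.
    """
    if environ is None:
        environ = os.environ
    return environ.get("ERASURE_CODING") == "true"
```

With this form, an unset variable and ERASURE_CODING=false both yield False, whereas a bare truthiness test on the string "false" would yield True.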
[Impala-ASF-CR] IMPALA-7019: Schedule EC as remote & disable failed tests
Tianyi Wang has posted comments on this change. ( http://gerrit.cloudera.org:8080/10413 )
Change subject: IMPALA-7019: Schedule EC as remote & disable failed tests
Patch Set 3:
Using the original block length/offset breaks more tests. I disabled them.
--
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84
Gerrit-Change-Number: 10413
Gerrit-PatchSet: 3
Gerrit-Owner: Tianyi Wang
Gerrit-Reviewer: Alex Behm
Gerrit-Reviewer: Tianyi Wang
Gerrit-Comment-Date: Fri, 18 May 2018 18:50:22 +
Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7019: Schedule EC as remote & disable failed tests
Tianyi Wang has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/10413 )
Change subject: IMPALA-7019: Schedule EC as remote & disable failed tests

IMPALA-7019: Schedule EC as remote & disable failed tests

This patch schedules HDFS EC files without considering locality. Failed tests are disabled, and a Jenkins build should succeed with export ERASURE_CODING=true.

Cherry-picks: not for 2.x.

Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84
---
M common/fbs/CatalogObjects.fbs
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M tests/common/skip.py
M tests/custom_cluster/test_admission_controller.py
M tests/custom_cluster/test_hdfs_fd_caching.py
M tests/metadata/test_explain.py
M tests/query_test/test_hdfs_caching.py
M tests/query_test/test_insert.py
M tests/query_test/test_insert_parquet.py
M tests/query_test/test_mt_dop.py
M tests/query_test/test_nested_types.py
M tests/query_test/test_queries.py
M tests/query_test/test_query_mem_limit.py
M tests/query_test/test_scanners.py
M tests/util/filesystem_utils.py
17 files changed, 87 insertions(+), 19 deletions(-)

git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/13/10413/3
--
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84
Gerrit-Change-Number: 10413
Gerrit-PatchSet: 3
Gerrit-Owner: Tianyi Wang
Gerrit-Reviewer: Alex Behm
Gerrit-Reviewer: Tianyi Wang
[Impala-ASF-CR] IMPALA-7019: Schedule EC as remote & disable failed tests
Tianyi Wang has posted comments on this change. ( http://gerrit.cloudera.org:8080/10413 )
Change subject: IMPALA-7019: Schedule EC as remote & disable failed tests
Patch Set 2:
(1 comment)
http://gerrit.cloudera.org:8080/#/c/10413/1/tests/custom_cluster/test_hdfs_fd_caching.py
File tests/custom_cluster/test_hdfs_fd_caching.py:
http://gerrit.cloudera.org:8080/#/c/10413/1/tests/custom_cluster/test_hdfs_fd_caching.py@29
PS1, Line 29: @SkipIfEC.remote_read
> Is this really expected to work? FD caching is only applied to local sc rea
Thanks for pointing it out. I changed it to SkipIfEC.remote_read.
--
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84
Gerrit-Change-Number: 10413
Gerrit-PatchSet: 2
Gerrit-Owner: Tianyi Wang
Gerrit-Reviewer: Alex Behm
Gerrit-Reviewer: Tianyi Wang
Gerrit-Comment-Date: Wed, 16 May 2018 20:17:06 +
Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7019: Schedule EC as remote & disable failed tests
Tianyi Wang has posted comments on this change. ( http://gerrit.cloudera.org:8080/10413 )
Change subject: IMPALA-7019: Schedule EC as remote & disable failed tests
Patch Set 2:
(4 comments)
http://gerrit.cloudera.org:8080/#/c/10413/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:
http://gerrit.cloudera.org:8080/#/c/10413/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@439
PS1, Line 439: // the block location API.
> Shouldn't this check be in L424 so that synthesizeFileMd is true for EC? Ar
- Files are enumerated at L430. L424 is in partition scope.
- In the latest patch set only the replica is substituted with a remote address. Block offset and length are preserved.
- Done

http://gerrit.cloudera.org:8080/#/c/10413/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@510
PS1, Line 510: continue;
> move check to L488?
Files are enumerated at L502. L488 is in partition scope.

http://gerrit.cloudera.org:8080/#/c/10413/1/tests/common/skip.py
File tests/common/skip.py:
http://gerrit.cloudera.org:8080/#/c/10413/1/tests/common/skip.py@149
PS1, Line 149: remote_read = pytest.mark.skipif(IS_EC, reason="EC files are read remotely and "
> Any more concrete reason, or are there too many to list?
I haven't scrutinized all the failed tests yet. I think this list will grow later.

http://gerrit.cloudera.org:8080/#/c/10413/1/tests/query_test/test_mt_dop.py
File tests/query_test/test_mt_dop.py:
http://gerrit.cloudera.org:8080/#/c/10413/1/tests/query_test/test_mt_dop.py@101
PS1, Line 101: # Impala scans fewer row groups than it should with erasure coding.
> fewer row groups
Done
--
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84
Gerrit-Change-Number: 10413
Gerrit-PatchSet: 2
Gerrit-Owner: Tianyi Wang
Gerrit-Reviewer: Alex Behm
Gerrit-Reviewer: Tianyi Wang
Gerrit-Comment-Date: Wed, 16 May 2018 20:16:05 +
Gerrit-HasComments: Yes
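The approach described in the first reply of this message (keep each block's offset and length, but replace the replica hosts with a single remote address) can be modelled roughly as follows. This is a Python sketch of the idea only, not the Java in HdfsPartition.java: `FileBlock`, `HostIndex`, `create_file_blocks`, and the placeholder value of `REMOTE_NETWORK_ADDRESS` are all hypothetical stand-ins, and the single function with an `is_ec` flag is an assumption about how the two code paths could be shared.

```python
from collections import namedtuple

# Simplified stand-in for a file block: byte range plus replica host indexes.
FileBlock = namedtuple("FileBlock", ["offset", "length", "host_indexes"])

# Placeholder value; the real constant's value is not shown in this thread.
REMOTE_NETWORK_ADDRESS = "remote-placeholder:0"


class HostIndex:
    """Maps host addresses to small integer indexes, adding new hosts on demand."""

    def __init__(self):
        self._index = {}

    def get_index(self, host):
        # setdefault returns the existing index, or stores the next free one.
        return self._index.setdefault(host, len(self._index))


def create_file_blocks(locations, host_index, is_ec):
    """Build FileBlocks from (offset, length, hosts) tuples.

    For EC files the real replica hosts are ignored and one remote address
    is recorded instead; offsets and lengths are preserved in both cases.
    """
    blocks = []
    for offset, length, hosts in locations:
        if is_ec:
            replicas = [host_index.get_index(REMOTE_NETWORK_ADDRESS)]
        else:
            replicas = [host_index.get_index(h) for h in hosts]
        blocks.append(FileBlock(offset, length, replicas))
    return blocks
```

Because every EC block carries only the remote placeholder, a scheduler working from this data has no local replica to prefer, which is one way to read "without considering locality" in the commit message.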
[Impala-ASF-CR] IMPALA-7019: Schedule EC as remote & disable failed tests
Tianyi Wang has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/10413 )
Change subject: IMPALA-7019: Schedule EC as remote & disable failed tests

IMPALA-7019: Schedule EC as remote & disable failed tests

This patch schedules HDFS EC files without considering locality. Failed tests are disabled, and a Jenkins build should succeed with export ERASURE_CODING=true.

Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84
---
M common/fbs/CatalogObjects.fbs
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M tests/common/skip.py
M tests/custom_cluster/test_hdfs_fd_caching.py
M tests/query_test/test_hdfs_caching.py
M tests/query_test/test_insert.py
M tests/query_test/test_insert_parquet.py
M tests/query_test/test_mt_dop.py
M tests/query_test/test_nested_types.py
M tests/query_test/test_queries.py
M tests/query_test/test_query_mem_limit.py
M tests/query_test/test_scanners.py
M tests/util/filesystem_utils.py
15 files changed, 79 insertions(+), 17 deletions(-)

git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/13/10413/2
--
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84
Gerrit-Change-Number: 10413
Gerrit-PatchSet: 2
Gerrit-Owner: Tianyi Wang
Gerrit-Reviewer: Alex Behm
[Impala-ASF-CR] IMPALA-7019: Schedule EC as remote & disable failed tests
Alex Behm has posted comments on this change. ( http://gerrit.cloudera.org:8080/10413 )
Change subject: IMPALA-7019: Schedule EC as remote & disable failed tests
Patch Set 1:
(5 comments)
http://gerrit.cloudera.org:8080/#/c/10413/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:
http://gerrit.cloudera.org:8080/#/c/10413/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@439
PS1, Line 439: if (synthesizeFileMd || fileStatus.isErasureCoded()) {
Shouldn't this check be in L424 so that synthesizeFileMd is true for EC? Are the EC blocks really completely useless to us, so that we're better off synthesizing them? I think it's worth adding a comment explaining why we chose this path for EC.

http://gerrit.cloudera.org:8080/#/c/10413/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@510
PS1, Line 510: if (synthesizeFileMd || fileStatus.isErasureCoded()) {
move check to L488?

http://gerrit.cloudera.org:8080/#/c/10413/1/tests/common/skip.py
File tests/common/skip.py:
http://gerrit.cloudera.org:8080/#/c/10413/1/tests/common/skip.py@149
PS1, Line 149: ec = pytest.mark.skipif(IS_EC, reason="It shouldn't work with erasure coding.")
Any more concrete reason, or are there too many to list?

http://gerrit.cloudera.org:8080/#/c/10413/1/tests/custom_cluster/test_hdfs_fd_caching.py
File tests/custom_cluster/test_hdfs_fd_caching.py:
http://gerrit.cloudera.org:8080/#/c/10413/1/tests/custom_cluster/test_hdfs_fd_caching.py@29
PS1, Line 29: @SkipIfEC.fix_later
Is this really expected to work? FD caching is only applied to local sc reads, and not used for remote reads.

http://gerrit.cloudera.org:8080/#/c/10413/1/tests/query_test/test_mt_dop.py
File tests/query_test/test_mt_dop.py:
http://gerrit.cloudera.org:8080/#/c/10413/1/tests/query_test/test_mt_dop.py@101
PS1, Line 101: # Impala scans less row group than it should with erasure coding.
fewer row groups
--
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84
Gerrit-Change-Number: 10413
Gerrit-PatchSet: 1
Gerrit-Owner: Tianyi Wang
Gerrit-Reviewer: Alex Behm
Gerrit-Comment-Date: Tue, 15 May 2018 20:43:35 +
Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7019: Schedule EC as remote & disable failed tests
Tianyi Wang has uploaded this change for review. ( http://gerrit.cloudera.org:8080/10413 )
Change subject: IMPALA-7019: Schedule EC as remote & disable failed tests

IMPALA-7019: Schedule EC as remote & disable failed tests

This patch schedules HDFS EC files without considering locality. Failed tests are disabled, and a Jenkins build should succeed with export ERASURE_CODING=true.

Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84
---
M common/fbs/CatalogObjects.fbs
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M tests/common/skip.py
M tests/custom_cluster/test_hdfs_fd_caching.py
M tests/query_test/test_hdfs_caching.py
M tests/query_test/test_insert.py
M tests/query_test/test_insert_parquet.py
M tests/query_test/test_mt_dop.py
M tests/query_test/test_nested_types.py
M tests/query_test/test_queries.py
M tests/query_test/test_query_mem_limit.py
M tests/query_test/test_scanners.py
M tests/util/filesystem_utils.py
15 files changed, 48 insertions(+), 15 deletions(-)

git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/13/10413/1
--
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84
Gerrit-Change-Number: 10413
Gerrit-PatchSet: 1
Gerrit-Owner: Tianyi Wang