[Impala-ASF-CR] IMPALA-7019: Schedule EC as remote & disable failed tests

2018-05-22 Thread Tianyi Wang (Code Review)
Tianyi Wang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10413 )

Change subject: IMPALA-7019: Schedule EC as remote & disable failed tests
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10413/6/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java:

http://gerrit.cloudera.org:8080/#/c/10413/6/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@118
PS6, Line 118: FileBlock.createFbFileBlock(fbb,
 :   loc.getOffset(), loc.getLength(),
 :   (short) hostIndex.getIndex(REMOTE_NETWORK_ADDRESS));
> I can try changing it to generating scan ranges in the backend later. I thi
Sorry, I didn't see your last comment, and I hadn't looked at L186. In that 
case there shouldn't be a difference between synthesizing and using the 
existing block locations. I will switch to generating scan ranges in the 
backend.



--
To view, visit http://gerrit.cloudera.org:8080/10413
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84
Gerrit-Change-Number: 10413
Gerrit-PatchSet: 6
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Tue, 22 May 2018 18:34:35 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7019: Schedule EC as remote & disable failed tests

2018-05-22 Thread Vuk Ercegovac (Code Review)
Vuk Ercegovac has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10413 )

Change subject: IMPALA-7019: Schedule EC as remote & disable failed tests
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10413/6/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java:

http://gerrit.cloudera.org:8080/#/c/10413/6/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@118
PS6, Line 118: FileBlock.createFbFileBlock(fbb,
 :   loc.getOffset(), loc.getLength(),
 :   (short) hostIndex.getIndex(REMOTE_NETWORK_ADDRESS));
> The reason is not strong: Though EC reads are remote, we might still don't
OK, so you were worried that a synthetic division into blocks could differ from 
how a file is chunked into blocks on HDFS. Assuming the block size provided by 
FileStatus never changes (L186), when would the two chunking schemes differ?
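
For reference, the synthetic division under discussion can be sketched as 
follows. This is a minimal Python illustration, not the Impala frontend code: 
synthesize_blocks, file_length and block_size are hypothetical names. It 
derives (offset, length) ranges from the file length and a fixed block size 
alone, without consulting real block locations.

```python
def synthesize_blocks(file_length, block_size):
    """Split a file into (offset, length) ranges of at most block_size bytes.

    Hypothetical helper for illustration only; it uses nothing but the file
    length and a fixed block size.
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    blocks = []
    offset = 0
    while offset < file_length:
        # The final block may be shorter than block_size.
        length = min(block_size, file_length - offset)
        blocks.append((offset, length))
        offset += length
    return blocks
```

With a 300-byte file and a 128-byte block size this yields three ranges, the 
last one 44 bytes long.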



--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84
Gerrit-Change-Number: 10413
Gerrit-PatchSet: 6
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Tue, 22 May 2018 18:27:05 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7019: Schedule EC as remote & disable failed tests

2018-05-22 Thread Tianyi Wang (Code Review)
Tianyi Wang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10413 )

Change subject: IMPALA-7019: Schedule EC as remote & disable failed tests
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10413/6/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java:

http://gerrit.cloudera.org:8080/#/c/10413/6/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@118
PS6, Line 118: FileBlock.createFbFileBlock(fbb,
 :   loc.getOffset(), loc.getLength(),
 :   (short) hostIndex.getIndex(REMOTE_NETWORK_ADDRESS));
> The reason is not strong: Though EC reads are remote, we might still don't
I can try changing it to generating scan ranges in the backend later. I think 
it's fine to leave it as it is now and merge your change.



--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84
Gerrit-Change-Number: 10413
Gerrit-PatchSet: 6
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Tue, 22 May 2018 18:27:14 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7019: Schedule EC as remote & disable failed tests

2018-05-22 Thread Tianyi Wang (Code Review)
Tianyi Wang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10413 )

Change subject: IMPALA-7019: Schedule EC as remote & disable failed tests
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10413/6/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java:

http://gerrit.cloudera.org:8080/#/c/10413/6/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@118
PS6, Line 118: FileBlock.createFbFileBlock(fbb,
 :   loc.getOffset(), loc.getLength(),
 :   (short) hostIndex.getIndex(REMOTE_NETWORK_ADDRESS));
> I know this is merged, but just wanted to revisit why we're storing block l
The reason is not strong: although EC reads are remote, we might still not want 
a block to be read as two blocks. Reading across a block boundary might lead to 
more connections to name nodes and data nodes.
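
To illustrate the concern, here is a small Python sketch (hypothetical helper, 
not Impala code) that counts how many fixed-size blocks a byte range touches. 
A range that is not aligned to block boundaries touches more than one block 
even when it is no longer than a single block.

```python
def blocks_touched(offset, length, block_size):
    """Return how many fixed-size blocks the range [offset, offset+length) spans.

    Hypothetical helper for illustration only.
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    if length <= 0:
        return 0
    first_block = offset // block_size
    last_block = (offset + length - 1) // block_size
    return last_block - first_block + 1
```

For example, a 128-byte range starting at offset 0 stays within one 128-byte 
block, while the same range starting at offset 64 spans two.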



--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84
Gerrit-Change-Number: 10413
Gerrit-PatchSet: 6
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Tue, 22 May 2018 18:17:42 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7019: Schedule EC as remote & disable failed tests

2018-05-22 Thread Vuk Ercegovac (Code Review)
Vuk Ercegovac has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10413 )

Change subject: IMPALA-7019: Schedule EC as remote & disable failed tests
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10413/6/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java:

http://gerrit.cloudera.org:8080/#/c/10413/6/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@118
PS6, Line 118: FileBlock.createFbFileBlock(fbb,
 :   loc.getOffset(), loc.getLength(),
 :   (short) hostIndex.getIndex(REMOTE_NETWORK_ADDRESS));
I know this is merged, but I just wanted to revisit why we're storing block 
locations for EC files this way rather than synthesizing them via L137?



--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84
Gerrit-Change-Number: 10413
Gerrit-PatchSet: 6
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Tue, 22 May 2018 17:58:42 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7019: Schedule EC as remote & disable failed tests

2018-05-21 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/10413 )

Change subject: IMPALA-7019: Schedule EC as remote & disable failed tests
..

IMPALA-7019: Schedule EC as remote & disable failed tests

This patch schedules HDFS EC files without considering locality. Failed
tests are disabled and a Jenkins build should succeed with export
ERASURE_CODING=true.

Testing: It passes core tests.

Cherry-picks: not for 2.x.

Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84
Reviewed-on: http://gerrit.cloudera.org:8080/10413
Reviewed-by: Taras Bobrovytsky 
Tested-by: Impala Public Jenkins 
---
M common/fbs/CatalogObjects.fbs
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M tests/common/skip.py
M tests/custom_cluster/test_admission_controller.py
M tests/custom_cluster/test_hdfs_fd_caching.py
M tests/metadata/test_explain.py
M tests/query_test/test_hdfs_caching.py
M tests/query_test/test_insert.py
M tests/query_test/test_insert_parquet.py
M tests/query_test/test_mt_dop.py
M tests/query_test/test_nested_types.py
M tests/query_test/test_queries.py
M tests/query_test/test_query_mem_limit.py
M tests/query_test/test_scanners.py
M tests/util/filesystem_utils.py
17 files changed, 75 insertions(+), 28 deletions(-)

Approvals:
  Taras Bobrovytsky: Looks good to me, approved
  Impala Public Jenkins: Verified

--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84
Gerrit-Change-Number: 10413
Gerrit-PatchSet: 6
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tianyi Wang 


[Impala-ASF-CR] IMPALA-7019: Schedule EC as remote & disable failed tests

2018-05-21 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10413 )

Change subject: IMPALA-7019: Schedule EC as remote & disable failed tests
..


Patch Set 5: Verified+1


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84
Gerrit-Change-Number: 10413
Gerrit-PatchSet: 5
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Comment-Date: Tue, 22 May 2018 01:10:12 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7019: Schedule EC as remote & disable failed tests

2018-05-21 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10413 )

Change subject: IMPALA-7019: Schedule EC as remote & disable failed tests
..


Patch Set 5:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2519/


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84
Gerrit-Change-Number: 10413
Gerrit-PatchSet: 5
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Comment-Date: Mon, 21 May 2018 21:43:09 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7019: Schedule EC as remote & disable failed tests

2018-05-21 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10413 )

Change subject: IMPALA-7019: Schedule EC as remote & disable failed tests
..


Patch Set 5: Code-Review+2


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84
Gerrit-Change-Number: 10413
Gerrit-PatchSet: 5
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Comment-Date: Mon, 21 May 2018 21:33:46 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7019: Schedule EC as remote & disable failed tests

2018-05-21 Thread Tianyi Wang (Code Review)
Tianyi Wang has uploaded a new patch set (#5). ( 
http://gerrit.cloudera.org:8080/10413 )

Change subject: IMPALA-7019: Schedule EC as remote & disable failed tests
..

IMPALA-7019: Schedule EC as remote & disable failed tests

This patch schedules HDFS EC files without considering locality. Failed
tests are disabled and a Jenkins build should succeed with export
ERASURE_CODING=true.

Testing: It passes core tests.

Cherry-picks: not for 2.x.

Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84
---
M common/fbs/CatalogObjects.fbs
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M tests/common/skip.py
M tests/custom_cluster/test_admission_controller.py
M tests/custom_cluster/test_hdfs_fd_caching.py
M tests/metadata/test_explain.py
M tests/query_test/test_hdfs_caching.py
M tests/query_test/test_insert.py
M tests/query_test/test_insert_parquet.py
M tests/query_test/test_mt_dop.py
M tests/query_test/test_nested_types.py
M tests/query_test/test_queries.py
M tests/query_test/test_query_mem_limit.py
M tests/query_test/test_scanners.py
M tests/util/filesystem_utils.py
17 files changed, 75 insertions(+), 28 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/13/10413/5
--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84
Gerrit-Change-Number: 10413
Gerrit-PatchSet: 5
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tianyi Wang 


[Impala-ASF-CR] IMPALA-7019: Schedule EC as remote & disable failed tests

2018-05-18 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10413 )

Change subject: IMPALA-7019: Schedule EC as remote & disable failed tests
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10413/4/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java:

http://gerrit.cloudera.org:8080/#/c/10413/4/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@129
PS4, Line 129: static FileDescriptor createEc(FileStatus fileStatus, 
BlockLocation[] blockLocations,
The logic here seems almost identical to the create() function. Maybe modify 
the create() function to accept an isEC argument instead of creating a new 
function?



--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84
Gerrit-Change-Number: 10413
Gerrit-PatchSet: 4
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Comment-Date: Fri, 18 May 2018 23:05:03 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7019: Schedule EC as remote & disable failed tests

2018-05-18 Thread Tianyi Wang (Code Review)
Tianyi Wang has uploaded a new patch set (#4). ( 
http://gerrit.cloudera.org:8080/10413 )

Change subject: IMPALA-7019: Schedule EC as remote & disable failed tests
..

IMPALA-7019: Schedule EC as remote & disable failed tests

This patch schedules HDFS EC files without considering locality. Failed
tests are disabled and a Jenkins build should succeed with export
ERASURE_CODING=true.

Testing: It passes core tests.

Cherry-picks: not for 2.x.

Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84
---
M common/fbs/CatalogObjects.fbs
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M tests/common/skip.py
M tests/custom_cluster/test_admission_controller.py
M tests/custom_cluster/test_hdfs_fd_caching.py
M tests/metadata/test_explain.py
M tests/query_test/test_hdfs_caching.py
M tests/query_test/test_insert.py
M tests/query_test/test_insert_parquet.py
M tests/query_test/test_mt_dop.py
M tests/query_test/test_nested_types.py
M tests/query_test/test_queries.py
M tests/query_test/test_query_mem_limit.py
M tests/query_test/test_scanners.py
M tests/util/filesystem_utils.py
17 files changed, 88 insertions(+), 20 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/13/10413/4
--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84
Gerrit-Change-Number: 10413
Gerrit-PatchSet: 4
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tianyi Wang 


[Impala-ASF-CR] IMPALA-7019: Schedule EC as remote & disable failed tests

2018-05-18 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10413 )

Change subject: IMPALA-7019: Schedule EC as remote & disable failed tests
..


Patch Set 3:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/10413/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/10413/3//COMMIT_MSG@12
PS3, Line 12:
Mention how this patch was tested. Did you run the exhaustive build?


http://gerrit.cloudera.org:8080/#/c/10413/3/tests/common/skip.py
File tests/common/skip.py:

http://gerrit.cloudera.org:8080/#/c/10413/3/tests/common/skip.py@29
PS3, Line 29: IS_ISILON,
: IS_LOCAL,
: IS_HDFS,
: IS_S3,
: IS_ADLS,
: SECONDARY_FILESYSTEM,
: IS_EC
This should be sorted alphabetically


http://gerrit.cloudera.org:8080/#/c/10413/3/tests/common/skip.py@150
PS3, Line 150: doesn't work.
"do not work"


http://gerrit.cloudera.org:8080/#/c/10413/3/tests/util/filesystem_utils.py
File tests/util/filesystem_utils.py:

http://gerrit.cloudera.org:8080/#/c/10413/3/tests/util/filesystem_utils.py@33
PS3, Line 33: os.getenv("ERASURE_CODING")
We want to check if ERASURE_CODING == true here.
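
A minimal sketch of such a check, assuming the variable is set to the literal 
string "true" when erasure coding is enabled. The function name and the 
optional environ parameter are hypothetical and are not taken from 
filesystem_utils.py.

```python
import os


def is_erasure_coding_enabled(environ=None):
    """Return True only when ERASURE_CODING is set to the string "true".

    Hypothetical helper: a bare os.getenv("ERASURE_CODING") is truthy for any
    non-empty value, including "false", so the value is compared explicitly.
    """
    env = os.environ if environ is None else environ
    return env.get("ERASURE_CODING", "") == "true"
```

Comparing against "true" avoids treating ERASURE_CODING=false as enabled.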



--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84
Gerrit-Change-Number: 10413
Gerrit-PatchSet: 3
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Comment-Date: Fri, 18 May 2018 21:57:14 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7019: Schedule EC as remote & disable failed tests

2018-05-18 Thread Tianyi Wang (Code Review)
Tianyi Wang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10413 )

Change subject: IMPALA-7019: Schedule EC as remote & disable failed tests
..


Patch Set 3:

Using the original block length/offset breaks more tests. I disabled them.


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84
Gerrit-Change-Number: 10413
Gerrit-PatchSet: 3
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Comment-Date: Fri, 18 May 2018 18:50:22 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7019: Schedule EC as remote & disable failed tests

2018-05-18 Thread Tianyi Wang (Code Review)
Tianyi Wang has uploaded a new patch set (#3). ( 
http://gerrit.cloudera.org:8080/10413 )

Change subject: IMPALA-7019: Schedule EC as remote & disable failed tests
..

IMPALA-7019: Schedule EC as remote & disable failed tests

This patch schedules HDFS EC files without considering locality. Failed
tests are disabled and a Jenkins build should succeed with export
ERASURE_CODING=true.

Cherry-picks: not for 2.x.

Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84
---
M common/fbs/CatalogObjects.fbs
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M tests/common/skip.py
M tests/custom_cluster/test_admission_controller.py
M tests/custom_cluster/test_hdfs_fd_caching.py
M tests/metadata/test_explain.py
M tests/query_test/test_hdfs_caching.py
M tests/query_test/test_insert.py
M tests/query_test/test_insert_parquet.py
M tests/query_test/test_mt_dop.py
M tests/query_test/test_nested_types.py
M tests/query_test/test_queries.py
M tests/query_test/test_query_mem_limit.py
M tests/query_test/test_scanners.py
M tests/util/filesystem_utils.py
17 files changed, 87 insertions(+), 19 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/13/10413/3
--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84
Gerrit-Change-Number: 10413
Gerrit-PatchSet: 3
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Tianyi Wang 


[Impala-ASF-CR] IMPALA-7019: Schedule EC as remote & disable failed tests

2018-05-16 Thread Tianyi Wang (Code Review)
Tianyi Wang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10413 )

Change subject: IMPALA-7019: Schedule EC as remote & disable failed tests
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10413/1/tests/custom_cluster/test_hdfs_fd_caching.py
File tests/custom_cluster/test_hdfs_fd_caching.py:

http://gerrit.cloudera.org:8080/#/c/10413/1/tests/custom_cluster/test_hdfs_fd_caching.py@29
PS1, Line 29: @SkipIfEC.remote_read
> Is this really expected to work? FD caching is only applied to local sc rea
Thanks for pointing it out. I changed it to SkipIfEC.remote_read



--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84
Gerrit-Change-Number: 10413
Gerrit-PatchSet: 2
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Comment-Date: Wed, 16 May 2018 20:17:06 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7019: Schedule EC as remote & disable failed tests

2018-05-16 Thread Tianyi Wang (Code Review)
Tianyi Wang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10413 )

Change subject: IMPALA-7019: Schedule EC as remote & disable failed tests
..


Patch Set 2:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/10413/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/10413/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@439
PS1, Line 439: // the block location API.
> Shouldn't this check be in L424 so that synthesizeFileMd is true for EC? Ar
- Files are enumerated at L430. L424 is in partition scope.
- In the latest patch set only the replica is substituted with a remote 
address. Block offset and length are preserved.
- Done
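
The second point above can be sketched in Python as follows. This is an 
illustration only, not the Java code in the patch: the dictionary layout, 
mark_blocks_remote and the REMOTE_HOST placeholder are all hypothetical. Each 
block keeps its offset and length while its replica hosts are replaced by a 
single remote placeholder.

```python
REMOTE_HOST = "remote"  # hypothetical placeholder for a non-local address


def mark_blocks_remote(block_locations, remote_host=REMOTE_HOST):
    """Return new block records whose replicas point at a remote placeholder.

    Offsets and lengths are copied unchanged; the input list is not modified.
    """
    return [
        {"offset": b["offset"], "length": b["length"], "hosts": [remote_host]}
        for b in block_locations
    ]
```

Because no replica matches a local host, a locality-aware scheduler would 
treat every such block as a remote read.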


http://gerrit.cloudera.org:8080/#/c/10413/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@510
PS1, Line 510: continue;
> move check to L488?
Files are enumerated at L502. L488 is in partition scope.


http://gerrit.cloudera.org:8080/#/c/10413/1/tests/common/skip.py
File tests/common/skip.py:

http://gerrit.cloudera.org:8080/#/c/10413/1/tests/common/skip.py@149
PS1, Line 149:   remote_read = pytest.mark.skipif(IS_EC, reason="EC files are 
read remotely and "
> Any more concrete reason, or are there too many to list?
I haven't scrutinized all the failed tests yet. I think this list will grow 
later.


http://gerrit.cloudera.org:8080/#/c/10413/1/tests/query_test/test_mt_dop.py
File tests/query_test/test_mt_dop.py:

http://gerrit.cloudera.org:8080/#/c/10413/1/tests/query_test/test_mt_dop.py@101
PS1, Line 101:   # Impala scans fewer row groups than it should with erasure 
coding.
> fewer row groups
Done



--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84
Gerrit-Change-Number: 10413
Gerrit-PatchSet: 2
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Comment-Date: Wed, 16 May 2018 20:16:05 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7019: Schedule EC as remote & disable failed tests

2018-05-16 Thread Tianyi Wang (Code Review)
Tianyi Wang has uploaded a new patch set (#2). ( 
http://gerrit.cloudera.org:8080/10413 )

Change subject: IMPALA-7019: Schedule EC as remote & disable failed tests
..

IMPALA-7019: Schedule EC as remote & disable failed tests

This patch schedules HDFS EC files without considering locality. Failed
tests are disabled and a Jenkins build should succeed with export
ERASURE_CODING=true.

Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84
---
M common/fbs/CatalogObjects.fbs
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M tests/common/skip.py
M tests/custom_cluster/test_hdfs_fd_caching.py
M tests/query_test/test_hdfs_caching.py
M tests/query_test/test_insert.py
M tests/query_test/test_insert_parquet.py
M tests/query_test/test_mt_dop.py
M tests/query_test/test_nested_types.py
M tests/query_test/test_queries.py
M tests/query_test/test_query_mem_limit.py
M tests/query_test/test_scanners.py
M tests/util/filesystem_utils.py
15 files changed, 79 insertions(+), 17 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/13/10413/2
--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84
Gerrit-Change-Number: 10413
Gerrit-PatchSet: 2
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Alex Behm 


[Impala-ASF-CR] IMPALA-7019: Schedule EC as remote & disable failed tests

2018-05-15 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10413 )

Change subject: IMPALA-7019: Schedule EC as remote & disable failed tests
..


Patch Set 1:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/10413/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/10413/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@439
PS1, Line 439:   if (synthesizeFileMd || fileStatus.isErasureCoded()) {
Shouldn't this check be in L424 so that synthesizeFileMd is true for EC? Are 
the EC blocks really completely useless to us, so that we're better off 
synthesizing them?

I think it's worth adding a comment explaining why we chose this path for EC.


http://gerrit.cloudera.org:8080/#/c/10413/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@510
PS1, Line 510: if (synthesizeFileMd || fileStatus.isErasureCoded()) {
move check to L488?


http://gerrit.cloudera.org:8080/#/c/10413/1/tests/common/skip.py
File tests/common/skip.py:

http://gerrit.cloudera.org:8080/#/c/10413/1/tests/common/skip.py@149
PS1, Line 149:   ec = pytest.mark.skipif(IS_EC, reason="It shouldn't work with 
erasure coding.")
Any more concrete reason, or are there too many to list?


http://gerrit.cloudera.org:8080/#/c/10413/1/tests/custom_cluster/test_hdfs_fd_caching.py
File tests/custom_cluster/test_hdfs_fd_caching.py:

http://gerrit.cloudera.org:8080/#/c/10413/1/tests/custom_cluster/test_hdfs_fd_caching.py@29
PS1, Line 29: @SkipIfEC.fix_later
Is this really expected to work? FD caching is only applied to local 
short-circuit (sc) reads, and is not used for remote reads.


http://gerrit.cloudera.org:8080/#/c/10413/1/tests/query_test/test_mt_dop.py
File tests/query_test/test_mt_dop.py:

http://gerrit.cloudera.org:8080/#/c/10413/1/tests/query_test/test_mt_dop.py@101
PS1, Line 101:   # Impala scans less row group than it should with erasure 
coding.
fewer row groups



--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84
Gerrit-Change-Number: 10413
Gerrit-PatchSet: 1
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Alex Behm 
Gerrit-Comment-Date: Tue, 15 May 2018 20:43:35 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7019: Schedule EC as remote & disable failed tests

2018-05-15 Thread Tianyi Wang (Code Review)
Tianyi Wang has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/10413 )


Change subject: IMPALA-7019: Schedule EC as remote & disable failed tests
..

IMPALA-7019: Schedule EC as remote & disable failed tests

This patch schedules HDFS EC files without considering locality. Failed
tests are disabled and a Jenkins build should succeed with export
ERASURE_CODING=true.

Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84
---
M common/fbs/CatalogObjects.fbs
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M tests/common/skip.py
M tests/custom_cluster/test_hdfs_fd_caching.py
M tests/query_test/test_hdfs_caching.py
M tests/query_test/test_insert.py
M tests/query_test/test_insert_parquet.py
M tests/query_test/test_mt_dop.py
M tests/query_test/test_nested_types.py
M tests/query_test/test_queries.py
M tests/query_test/test_query_mem_limit.py
M tests/query_test/test_scanners.py
M tests/util/filesystem_utils.py
15 files changed, 48 insertions(+), 15 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/13/10413/1
--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I138738d3e28e5daa1718c05c04cd9dd146c4ff84
Gerrit-Change-Number: 10413
Gerrit-PatchSet: 1
Gerrit-Owner: Tianyi Wang