[ https://issues.apache.org/jira/browse/IMPALA-5931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16933615#comment-16933615 ]
ASF subversion and git services commented on IMPALA-5931:
---------------------------------------------------------

Commit feed25084a999fe0a4e7b58b5264fce5829c43e7 in impala's branch refs/heads/master from stakiar
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=feed250 ]

IMPALA-8944: Update and re-enable S3PlannerTest

Addresses several test infra issues that were preventing the
S3PlannerTest from running successfully. Disables a few tests that are
no longer working, and removes some planner checks that are no longer
applicable when running on S3.

Specifically, this patch removes the checks in
PlannerTestBase#checkScanRangeLocations when running against S3, because
the planner no longer generates scan ranges; generation is deferred to
the scheduler (IMPALA-5931).

Replaces the old logic of specifying S3-specific fe/ tests with a
combination of JUnit Categories and Maven Profiles. The previous method
was broken and assumed that all S3-specific fe/ tests started with S3*.
The new approach removes that restriction and only requires S3-specific
JUnit tests to be tagged with the Java annotation
'@Category(S3Tests.class)' (entire classes or individual tests can be
tagged with the annotation).

Testing:
* Ran fe/ tests with TARGET_FILESYSTEM=s3

Change-Id: I1690b6c5346376c1111fd4845c72062cc237e0f9
Reviewed-on: http://gerrit.cloudera.org:8080/14248
Reviewed-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>


> Don't synthesize block metadata in the catalog for S3/ADLS
> ----------------------------------------------------------
>
>                 Key: IMPALA-5931
>                 URL: https://issues.apache.org/jira/browse/IMPALA-5931
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Catalog
>            Reporter: Dan Hecht
>            Assignee: Vuk Ercegovac
>            Priority: Major
>             Fix For: Impala 2.13.0, Impala 3.1.0
>
>
> Today, the catalog synthesizes block metadata for S3/ADLS by just breaking up
> splittable files into "blocks" with the FileSystem's default block size.
> Rather than carrying these blocks around in the catalog and distributing them
> to all impalad's, we might as well generate the scan ranges on-the-fly during
> planning. That would save the memory and network bandwidth of blocks.
> That does mean that the planner will have to instantiate and call the
> filesystem to get the default block size, but for these FileSystem's, that's
> just a matter of reading the config.
> Perhaps the same can be done for HDFS erasure coding, though that depends on
> what a block location actually means in that context and whether they contain
> useful info.
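
For illustration only, the idea in the IMPALA-5931 description (cutting a splittable file into ranges of the filesystem's default block size on the fly, instead of storing synthesized blocks) might be sketched in Java roughly as follows. This is not Impala's implementation: the class ScanRangeSketch, the ScanRange type, and generateScanRanges are hypothetical names, the block size is passed in as a plain number rather than read from a filesystem config, and the handling of empty and non-splittable files is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; none of these names come from the Impala code base.
final class ScanRangeSketch {

    /** A byte range [offset, offset + length) within a single file. */
    static final class ScanRange {
        final long offset;
        final long length;

        ScanRange(long offset, long length) {
            this.offset = offset;
            this.length = length;
        }

        @Override
        public String toString() {
            return "offset=" + offset + " length=" + length;
        }
    }

    /**
     * Splits a file of fileLength bytes into ranges of at most blockSize bytes.
     * Assumptions: an empty file yields no ranges, and a non-splittable file
     * yields exactly one range covering the whole file.
     */
    static List<ScanRange> generateScanRanges(long fileLength, long blockSize,
                                              boolean splittable) {
        if (fileLength < 0) {
            throw new IllegalArgumentException("fileLength must be >= 0");
        }
        if (blockSize <= 0) {
            throw new IllegalArgumentException("blockSize must be > 0");
        }
        List<ScanRange> ranges = new ArrayList<>();
        if (fileLength == 0) {
            return ranges;
        }
        if (!splittable) {
            ranges.add(new ScanRange(0, fileLength));
            return ranges;
        }
        long offset = 0;
        while (offset < fileLength) {
            // The last range may be shorter than blockSize.
            long length = Math.min(blockSize, fileLength - offset);
            ranges.add(new ScanRange(offset, length));
            offset += length;
        }
        return ranges;
    }

    public static void main(String[] args) {
        // A 300-byte file with a 128-byte "block size" gives three ranges.
        for (ScanRange r : generateScanRanges(300L, 128L, true)) {
            System.out.println(r);
        }
    }
}
```

With a 300-byte file and a block size of 128, the sketch produces three ranges of lengths 128, 128 and 44. Nothing here needs to be stored or shipped between daemons, which is the memory and bandwidth saving the issue description points at.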