[ https://issues.apache.org/jira/browse/HIVE-17292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Peter Vary updated HIVE-17292:
------------------------------
    Attachment: HIVE-17292.5.patch

Well, this patch has become huge :(
The actual code/configuration change is minimal:
- QTestUtil.java - to check for 4 cores before allowing a query to run
- SparkSessionImpl.java - to use the same method of calculating cores as with spark.master="spark.\*"
- Hadoop23Shims.java - to change the scheduler allocation minimum, which allows the MiniCluster to create 2 nodes
- The others are only q.out changes
-- Number of executors 2->4
-- The number of result files is higher because the executor number is higher
-- When there is no order by in the query, the resulting lines are mixed in some cases (union.q.out, union11.q.out, union14.q.out, union15.q.out, union7.q.out, union_null.q.out) - We might have to apply {{-- SORT_QUERY_RESULTS}} if they become flaky
-- The overall size of the result files has become bigger (union_remove_10.q.out, union_remove_13.q.out, union_remove_15.q.out, union_remove_16.q.out, union_remove_7.q.out, union_remove_8.q.out, union_remove_9.q.out) - I think the number of files and the overhead of the RCFileOutputFormat cause this
- spark_dynamic_partition_pruning_mapjoin_only.q.out is changed - See: HIVE-16948

What do you think about this change [~lirui]?
Shall we bite the bullet and review/commit it - do we have a good way to validate the changes?
Or shall we chicken out and change the configuration back to use only 1 executor with 2 cores, so that only a configuration change is needed?
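
To illustrate why the scheduler setting matters, here is a minimal sketch of the rounding arithmetic, assuming each memory request is rounded up to the next multiple of the scheduler increment. The 2048MB node capacity and the class and method names are hypothetical, chosen only for illustration; they are not taken from the MiniCluster or from the patch.

```java
public class ContainerRounding {
    // Round a memory request up to the next multiple of the scheduler increment.
    static int roundUp(int requestedMb, int incrementMb) {
        return ((requestedMb + incrementMb - 1) / incrementMb) * incrementMb;
    }

    // Number of containers of the rounded size that fit into the given capacity.
    static int containersThatFit(int nodeMb, int requestedMb, int incrementMb) {
        return nodeMb / roundUp(requestedMb, incrementMb);
    }

    public static void main(String[] args) {
        int nodeMb = 2048;      // hypothetical capacity, NOT taken from the patch
        int requestedMb = 512;  // container size mentioned in the issue

        // With 1GB increments every 512MB request occupies 1024MB.
        System.out.println(containersThatFit(nodeMb, requestedMb, 1024));
        // With 512MB increments the request is granted as asked.
        System.out.println(containersThatFit(nodeMb, requestedMb, 512));
    }
}
```

Under these assumed numbers, the 1GB increment leaves room for only 2 containers, so a 3rd request cannot be satisfied, while the 512MB increment leaves room for 4.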

> Change TestMiniSparkOnYarnCliDriver test configuration to use the configured cores
> ----------------------------------------------------------------------------------
>
>                 Key: HIVE-17292
>                 URL: https://issues.apache.org/jira/browse/HIVE-17292
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark, Test
>    Affects Versions: 3.0.0
>            Reporter: Peter Vary
>            Assignee: Peter Vary
>         Attachments: HIVE-17292.1.patch, HIVE-17292.2.patch, HIVE-17292.3.patch, HIVE-17292.5.patch
>
>
> Currently the {{hive-site.xml}} for the {{TestMiniSparkOnYarnCliDriver}} test defines 2 cores and 2 executors, but only 1 is used, because the MiniCluster does not allow the creation of the 3rd container.
> The FairScheduler uses 1GB increments for memory, but the containers would like to use only 512MB. We should change the FairScheduler configuration to use only the requested 512MB.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)