[ 
https://issues.apache.org/jira/browse/HIVE-17292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary updated HIVE-17292:
------------------------------
    Attachment: HIVE-17292.5.patch

Well, this patch has become huge :(

The actual code/configuration change is minimal:
- QTestUtil.java - to check for 4 cores before allowing a query to run
- SparkSessionImpl.java - to use the same method for calculating cores as is 
used with spark.master="spark.\*"
- Hadoop23Shims.java - to change the scheduler allocation minimum, thereby 
allowing the MiniCluster to create 2 nodes
- The others are only q.out changes
-- Number of executors 2->4
-- The number of result files is higher because the executor number is higher
-- When there is no ORDER BY in the query, the resulting lines come out in a 
different order in some cases (union.q.out, union11.q.out, union14.q.out, 
union15.q.out, union7.q.out, union_null.q.out) - We might have to apply 
{{-- SORT_QUERY_RESULTS}} if they become flaky
-- The overall size of the result files has become bigger 
(union_remove_10.q.out, union_remove_13.q.out, union_remove_15.q.out, 
union_remove_16.q.out, union_remove_7.q.out, union_remove_8.q.out, 
union_remove_9.q.out) - I think the number of files and the overhead of the 
RCFileOutputFormat cause this
- spark_dynamic_partition_pruning_mapjoin_only.q.out is changed - See: 
HIVE-16948
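
To illustrate why the scheduler change matters, here is a minimal sketch of the rounding arithmetic from the issue description (1GB increments versus 512MB requests). It is not the code from the patch: the class and method names are made up for illustration, the 2048MB node size is a hypothetical value chosen only to make the numbers easy to follow, and it assumes every container asks for 512MB.

```java
public class ContainerRounding {

    // Round a requested container size up to the next multiple of the
    // scheduler increment (both values in MB).
    public static int roundUp(int requestedMb, int incrementMb) {
        return ((requestedMb + incrementMb - 1) / incrementMb) * incrementMb;
    }

    // How many containers of the rounded size fit into one node.
    public static int containersThatFit(int nodeMemoryMb, int requestedMb, int incrementMb) {
        return nodeMemoryMb / roundUp(requestedMb, incrementMb);
    }

    public static void main(String[] args) {
        int nodeMemoryMb = 2048; // hypothetical node size, not taken from the patch
        int requestedMb = 512;   // container size mentioned in the issue description

        // With a 1GB increment every 512MB request is rounded up to 1024MB.
        System.out.println("1024MB increment: "
                + containersThatFit(nodeMemoryMb, requestedMb, 1024) + " containers");

        // With a 512MB increment the request is granted as asked.
        System.out.println("512MB increment: "
                + containersThatFit(nodeMemoryMb, requestedMb, 512) + " containers");
    }
}
```

Under these assumptions the 1GB increment leaves room for only 2 containers, so a 3rd one cannot be created, while the 512MB increment leaves room for 4.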

What do you think about this change [~lirui]?
Shall we bite the bullet, and review/commit it - do we have a good way to 
validate the changes?
Or shall we chicken out, and change the configuration back to use only 1 
executor with 2 cores, so that only a configuration change is needed?

> Change TestMiniSparkOnYarnCliDriver test configuration to use the configured 
> cores
> ----------------------------------------------------------------------------------
>
>                 Key: HIVE-17292
>                 URL: https://issues.apache.org/jira/browse/HIVE-17292
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark, Test
>    Affects Versions: 3.0.0
>            Reporter: Peter Vary
>            Assignee: Peter Vary
>         Attachments: HIVE-17292.1.patch, HIVE-17292.2.patch, 
> HIVE-17292.3.patch, HIVE-17292.5.patch
>
>
> Currently the {{hive-site.xml}} for the {{TestMiniSparkOnYarnCliDriver}} test 
> defines 2 cores, and 2 executors, but only 1 is used, because the MiniCluster 
> does not allow the creation of the 3rd container.
> The FairScheduler uses 1GB increments for memory, but the containers would 
> like to use only 512MB. We should change the FairScheduler configuration to 
> use only the requested 512MB.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
