[ https://issues.apache.org/jira/browse/HIVE-13217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15202608#comment-15202608 ]
Szehon Ho commented on HIVE-13217:
----------------------------------

Sorry, you are right, it is min(). The latest patch looks good to me, +1. Thanks!

> Replication for HoS mapjoin small file needs to respect dfs.replication.max
> ---------------------------------------------------------------------------
>
>                 Key: HIVE-13217
>                 URL: https://issues.apache.org/jira/browse/HIVE-13217
>             Project: Hive
>          Issue Type: Bug
>          Components: Spark
>    Affects Versions: 1.2.1, 2.0.0
>            Reporter: Szehon Ho
>            Assignee: Chinna Rao Lalam
>            Priority: Minor
>         Attachments: HIVE-13217.1.patch, HIVE-13217.2.patch
>
>
> Currently, Hive on Spark (HoS) mapjoin sets the replication of the small
> table file to a hard-coded value of 10. See
> SparkHashTableSinkOperator.MIN_REPLICATION.
> When dfs.replication.max is less than 10, the HoS query fails. This
> constant should be capped at dfs.replication.max.
> Normally dfs.replication.max seems to be set to 512.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
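
For reference, the min() capping discussed above can be sketched as follows. This is only an illustration and not the contents of the attached patches: the class name ReplicationCap and the method cappedReplication are hypothetical, and the value of dfs.replication.max is passed in as a plain int instead of being read from the Hadoop configuration. The constant 10 is the hard-coded value the issue attributes to SparkHashTableSinkOperator.MIN_REPLICATION.

```java
// Hypothetical, self-contained sketch of capping the replication factor.
// It does not depend on Hive or Hadoop classes.
class ReplicationCap {
    // Hard-coded value described in the issue
    // (SparkHashTableSinkOperator.MIN_REPLICATION).
    static final int MIN_REPLICATION = 10;

    // Returns the replication factor to request for the small table file:
    // the hard-coded value, but never more than dfs.replication.max.
    static short cappedReplication(int dfsReplicationMax) {
        return (short) Math.min(MIN_REPLICATION, dfsReplicationMax);
    }

    public static void main(String[] args) {
        // A cluster limited to 3 replicas gets 3 instead of a failing 10.
        System.out.println(cappedReplication(3));
        // With the usual limit of 512 the hard-coded 10 is kept.
        System.out.println(cappedReplication(512));
    }
}
```

With min(), a cluster whose dfs.replication.max is below 10 requests only as many replicas as it allows, while clusters with the usual higher limit behave as before.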