[ https://issues.apache.org/jira/browse/SQOOP-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15430980#comment-15430980 ]
ASF GitHub Bot commented on SQOOP-3002:
---------------------------------------
Github user kevin00chen commented on a diff in the pull request:
https://github.com/apache/sqoop/pull/26#discussion_r75698166
--- Diff: src/java/org/apache/sqoop/mapreduce/MergeMapperBase.java ---
@@ -76,9 +76,10 @@ protected void processRecord(SqoopRecord r, Context c)
}
Object keyObj = null;
if (keyColName.contains(",")) {
+ String connectStr = new String(new byte[]{1});
StringBuilder keyFieldsSb = new StringBuilder();
for (String str : keyColName.split(",")) {
- keyFieldsSb.append("+").append(fieldMap.get(str).toString());
+ keyFieldsSb.append(connectStr).append(fieldMap.get(str).toString());
--- End diff --
For example, suppose a table has two columns, a and b:
Field a | Field b
------------ | -------------
a+ | b
a | +b
When "+" is used to join the two fields, both records produce the same keyObj ("+a++b").
To avoid this, I use a String containing a single byte with value 1 as the separator.
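
The collision described above can be reproduced outside Sqoop with a small standalone sketch. The class name `CompositeKeyDemo` and the helper `joinKey` are hypothetical names used only for illustration; they are not part of Sqoop. The sketch also assumes that `new String(new byte[]{1})` decodes to the character `\u0001`, which holds for ASCII-compatible default charsets.

```java
import java.util.Arrays;
import java.util.List;

public class CompositeKeyDemo {

    // Hypothetical helper: places the separator before each value,
    // following the append pattern shown in the diff.
    static String joinKey(List<String> values, String separator) {
        StringBuilder sb = new StringBuilder();
        for (String v : values) {
            sb.append(separator).append(v);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The two rows from the table in this comment.
        List<String> row1 = Arrays.asList("a+", "b");
        List<String> row2 = Arrays.asList("a", "+b");

        String plus = "+";
        // Assumption: equivalent to new String(new byte[]{1}) under an
        // ASCII-compatible default charset.
        String ctrlA = "\u0001";

        // With "+", both rows yield "+a++b", so the keys collide.
        System.out.println(joinKey(row1, plus).equals(joinKey(row2, plus)));

        // With the control character, the keys stay distinct.
        System.out.println(joinKey(row1, ctrlA).equals(joinKey(row2, ctrlA)));
    }
}
```

A control-character separator only makes collisions unlikely. If a key field can itself contain the byte 0x01, two different records could still map to the same key.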
> Sqoop Merge Tool support composite merge-key
> --------------------------------------------
>
> Key: SQOOP-3002
> URL: https://issues.apache.org/jira/browse/SQOOP-3002
> Project: Sqoop
> Issue Type: Improvement
> Components: hive-integration
> Affects Versions: 1.4.5, 1.4.6, 1.99.5, 1.99.7
> Reporter: KaimingChen
>
> When I use the sqoop merge tool, I can specify only one column with the --merge-key
> argument.
> But when my table has a composite key and I use --merge-key column1,column2,
> I get an exception:
> 16/08/22 15:54:15 INFO mapreduce.Job: Task Id : attempt_1470135750174_2508_m_000004_2, Status : FAILED
> Error: java.io.IOException: Cannot join values on null key. Did you specify a key column that exists?
> at org.apache.sqoop.mapreduce.MergeMapperBase.processRecord(MergeMapperBase.java:79)
> at org.apache.sqoop.mapreduce.MergeTextMapper.map(MergeTextMapper.java:58)
> at org.apache.sqoop.mapreduce.MergeTextMapper.map(MergeTextMapper.java:34)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:339)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)