[ https://issues.apache.org/jira/browse/SPARK-16985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15414644#comment-15414644 ]
Hong Shen commented on SPARK-16985:
-----------------------------------

The cause is that the output file name is built with SimpleDateFormat("yyyyMMddHHmm"), so if two SQL statements insert into the same table within the same minute, the earlier output is overwritten. I think we should change the date format to "yyyyMMddHHmmss"; in our cluster, a SQL statement cannot finish within one second.

> SQL Output maybe overrided
> --------------------------
>
>                 Key: SPARK-16985
>                 URL: https://issues.apache.org/jira/browse/SPARK-16985
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.0.0
>            Reporter: Hong Shen
>
> In our cluster, SQL output is sometimes overwritten. This happens when I
> submit several SQL statements that all insert into the same table and each
> takes less than one minute. Here are the details:
> 1 sql1, 11:03 insert into table.
> 2 sql2, 11:04:11 insert into table.
> 3 sql3, 11:04:48 insert into table.
> 4 sql4, 11:05 insert into table.
> 5 sql5, 11:06 insert into table.
> The output file of sql3 overwrites the output file of sql2. Here is the log:
> {code}
> 16/05/04 11:04:11 INFO hive.SparkHiveHadoopWriter: 
> XXfinalPath=hdfs://tl-sng-gdt-nn-tdw.tencent-distribute.com:54310/tmp/assorz/tdw-tdwadmin/20160504/04559505496526517_-1_1204544348/10000/_tmp.p_20160428/attempt_201605041104_0001_m_000000_1
> 16/05/04 11:04:48 INFO hive.SparkHiveHadoopWriter: 
> XXfinalPath=hdfs://tl-sng-gdt-nn-tdw.tencent-distribute.com:54310/tmp/assorz/tdw-tdwadmin/20160504/04559505496526517_-1_212180468/10000/_tmp.p_20160428/attempt_201605041104_0001_m_000000_1
> {code}
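The collision described in the comment can be reproduced in isolation with a minimal Java sketch. It is not Spark's actual writer code: the class `JobIdCollision` and the helpers `jobTrackerId` and `at` are hypothetical names introduced only for illustration. It formats the two start times from the log (11:04:11 and 11:04:48 on 2016-05-04) with both patterns and shows that only the minute-resolution pattern yields the same string.

```java
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;

// Hypothetical illustration class; not part of Spark.
final class JobIdCollision {

    // Formats a job start time with the given pattern (hypothetical helper).
    static String jobTrackerId(String pattern, Date start) {
        return new SimpleDateFormat(pattern).format(start);
    }

    // Builds a Date on 2016-05-04 in the JVM's default time zone.
    static Date at(int hour, int minute, int second) {
        return new GregorianCalendar(2016, Calendar.MAY, 4, hour, minute, second).getTime();
    }

    public static void main(String[] args) {
        Date sql2 = at(11, 4, 11); // start time of sql2 in the report
        Date sql3 = at(11, 4, 48); // start time of sql3 in the report

        // Minute resolution: both jobs map to the same identifier.
        System.out.println(jobTrackerId("yyyyMMddHHmm", sql2));   // 201605041104
        System.out.println(jobTrackerId("yyyyMMddHHmm", sql3));   // 201605041104

        // Second resolution: the identifiers differ.
        System.out.println(jobTrackerId("yyyyMMddHHmmss", sql2)); // 20160504110411
        System.out.println(jobTrackerId("yyyyMMddHHmmss", sql3)); // 20160504110448
    }
}
```

Note that second resolution only narrows the window: two jobs started within the same second would still produce the same string, so the proposed change relies on the assumption stated in the comment that no statement finishes within one second.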