Github user tgravescs commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21558#discussion_r197222759

    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/WriteToDataSourceV2.scala ---
    @@ -110,7 +108,7 @@ object DataWritingSparkTask extends Logging {
           useCommitCoordinator: Boolean): WriterCommitMessage = {
         val stageId = context.stageId()
         val partId = context.partitionId()
    -    val attemptId = context.attemptNumber()
    +    val attemptId = context.taskAttemptId().toInt
    --- End diff --

    HadoopWriteConfigUtil has the same issue: it's a public interface and uses an Int for the attempt number. It seems somewhat unlikely, but it is more likely for task ids to overflow an Int in Spark than in, say, MapReduce. We do have partitionId as an Int, so if partition counts approach the Int range and you have task failures, task ids could go over Int. Looking at our options
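
    The overflow concern can be sketched as follows. This is a minimal, self-contained illustration (not Spark code): it assumes only that `taskAttemptId()` returns a `Long`, as in Spark's `TaskContext`, and shows that `.toInt` silently truncates once the value passes `Int.MaxValue`:

    ```scala
    object AttemptIdOverflowSketch {
      def main(args: Array[String]): Unit = {
        // Hypothetical task attempt id just past the Int range.
        val taskAttemptId: Long = Int.MaxValue.toLong + 1  // 2147483648
        // .toInt truncates the Long, wrapping to a negative value.
        val attemptId: Int = taskAttemptId.toInt
        println(attemptId)  // wraps to Int.MinValue (-2147483648)
      }
    }
    ```

    So any downstream consumer that treats the attempt id as a non-negative Int (e.g. a Hadoop TaskAttemptID) would see a corrupted value rather than a hard failure.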