[ https://issues.apache.org/jira/browse/KUDU-2812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Grant Henke updated KUDU-2812:
------------------------------
    Component/s:     (was: spark)

> Problem with error reporting in kudu-backup
> -------------------------------------------
>
>                 Key: KUDU-2812
>                 URL: https://issues.apache.org/jira/browse/KUDU-2812
>             Project: Kudu
>          Issue Type: Bug
>          Components: backup
>    Affects Versions: 1.9.0
>            Reporter: Will Berkeley
>            Priority: Major
>              Labels: backup
>             Fix For: 1.10.0
>
>
> In KuduRestore.scala we have code like
> {noformat}
>       // Fail the task if there are any errors.
>       val errorCount = session.getPendingErrors.getRowErrors.length
>       if (errorCount > 0) {
>         val errors =
>           session.getPendingErrors.getRowErrors.take(5).map(_.getErrorStatus).mkString
>         throw new RuntimeException(
>           s"failed to write $errorCount rows from DataFrame to Kudu; sample errors: $errors")
>       }
> {noformat}
> There's similar code in KuduContext.scala:
> {noformat}
>     val errorCount = pendingErrors.getRowErrors.length
>     if (errorCount > 0) {
>       val errors =
>         pendingErrors.getRowErrors.take(5).map(_.getErrorStatus).mkString
>       throw new RuntimeException(
>         s"failed to write $errorCount rows from DataFrame to Kudu; sample errors: $errors")
>     }
> {noformat}
> I've seen the former fail to print any sample errors. Taking a reference to
> {{session.getPendingErrors.getRowErrors}} and using that throughout fixes this,
> so it seems like there's some TOCTOU problem that can occur, probably because
> multiple batches can be in flight at once.
> The latter is most likely vulnerable to this as well.
> This issue made diagnosing KUDU-2809 harder.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
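
A minimal sketch of the workaround described in the report: read the pending errors once and derive both the count and the sample from that single snapshot. It is not the actual patch. StubSession, RowError, PendingErrors and failureMessage are hypothetical stand-ins so the example runs without a Kudu cluster, and the stub assumes that each getPendingErrors call returns the buffered errors and clears them, which would explain the empty sample in the original code.

```scala
// Hypothetical stand-ins for the Kudu client types (not the real API).
final case class RowError(status: String) {
  def getErrorStatus: String = status
}

final class PendingErrors(errors: Array[RowError]) {
  def getRowErrors: Array[RowError] = errors
}

// Assumption: every call to getPendingErrors drains the session's error buffer.
final class StubSession(initial: Seq[RowError]) {
  private var buffered: Vector[RowError] = initial.toVector

  def getPendingErrors: PendingErrors = synchronized {
    val drained = buffered
    buffered = Vector.empty
    new PendingErrors(drained.toArray)
  }
}

object PendingErrorsSketch {
  // Take one reference to the row errors, then use it for the check,
  // the count and the sample, so all three see the same data.
  def failureMessage(session: StubSession): Option[String] = {
    val rowErrors = session.getPendingErrors.getRowErrors
    if (rowErrors.isEmpty) {
      None
    } else {
      val sample = rowErrors.take(5).map(_.getErrorStatus).mkString("; ")
      Some(
        s"failed to write ${rowErrors.length} rows from DataFrame to Kudu; " +
          s"sample errors: $sample")
    }
  }
}
```

With this shape, a caller would throw a RuntimeException carrying the returned message when it is defined. The explicit "; " separator passed to mkString is an addition for readability; the quoted code concatenates the statuses with no separator.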