I can definitely see how more detailed logs could be useful, so I like what you're suggesting. Another option would be to make this configurable, so that you can pass in an argument to turn on "showStandardStreams". It would be false by default, but you could turn it on while you're debugging this issue.
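
Something along these lines might work for the configurable version. This is only a rough sketch that I have not tried against the Iceberg build, and the property name "showTestStreams" is just a placeholder I made up:

```groovy
// build.gradle (sketch only; 'showTestStreams' is a hypothetical property name)
test {
  // Off by default. Enable with: ./gradlew test -PshowTestStreams=true
  def showStreams = project.hasProperty('showTestStreams') &&
      project.property('showTestStreams').toString().toBoolean()

  testLogging {
    events "failed"
    exceptionFormat "full"
    // Forward the tests' stdout/stderr to the build output only when asked to
    showStandardStreams = showStreams
  }
}
```

The CI workflow could then pass the flag only for the modules we are trying to debug, and local builds would stay as quiet as they are today.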

On Wed, 18 Nov 2020 at 09:03, Peter Vary <pv...@cloudera.com.invalid> wrote:

> Hi Team,
>
> Recently I have been trying to reproduce the following CI failure,
> without success:
>
> *org.apache.iceberg.mr.hive.TestHiveIcebergStorageHandlerWithCustomCatalog > testScanTable[fileFormat=PARQUET, engine=tez] FAILED
>     java.lang.IllegalArgumentException: Failed to execute Hive query 'SELECT * FROM default.customers ORDER BY customer_id DESC': Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.tez.TezTask
>         Caused by:
>         org.apache.hive.service.cli.HiveSQLException: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.tez.TezTask*
>
> Since I was unable to reproduce the case, and the error message in the
> CI logs is not really helpful, I cannot fix this flaky test for now. :(
>
> After Marton Bod's changes for adding logs for tests (
> https://github.com/apache/iceberg/pull/1712), we could have more info
> about the failures in the test logs (
> *build/test-results/test/binary/output.bin*), but I am not sure if that
> is retained and accessible after a CI run.
>
> I would like to propose adding the following to the build.gradle for the
> CI runs:
>
>     test {
>       testLogging {
>         if ("true".equalsIgnoreCase(System.getenv('CI'))) {
>           events "failed", "passed"
>           testLogging.showStandardStreams = true
>         } else {
>           events "failed"
>         }
>         exceptionFormat "full"
>       }
>     }
>
> This would add the logs printed during the tests to the standard output
> for the CI runs. An example can be seen here (
> https://github.com/pvary/iceberg/runs/1405960983). In this patch I only
> enabled standard streams for the Hive-related tests to see the results.
>
> Pros:
>
>    - Easily accessible log information for the failed runs
>
> Cons:
>
>    - Harder to read CI logs
>    - Possible cost associated with retaining the logs
>
> I think having more logs would be great, but I am not sure who pays the
> bill and whether having bigger logs could cause any problem and whether the
> CI is able to handle the increased amount of data.
>
> Any thoughts, comments, ideas?
>
> Thanks,
> Peter