[ https://issues.apache.org/jira/browse/TRAFODION-2109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15376315#comment-15376315 ]
Selvaganesan Govindarajan commented on TRAFODION-2109:
------------------------------------------------------

Analysis of the issue reveals the following:

One of the ESPs returned the internal error below while writing the error log record to HDFS.

data_ = 0x7f8a38756d70 "\njava.nio.channels.ClosedChannelException\n org.apache.hadoop.hdfs.DFSOutputStream.checkClosed(DFSOutputStream.java:1635)\n org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:104)\n org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:58)\n java.io.DataOutputStream.write(DataOutputStream.java:107)\n java.io.FilterOutputStream.write(FilterOutputStream.java:97)\n org.trafodion.sql.HiveClient.hdfsWrite(HiveClient.java:278)\n"

The HDFS scan returned error -8413. The error row, along with the error information, is written to the error log file in HDFS. However, the write to the error log file itself failed, as shown in the stack trace above.

This ESP is supposed to dump core with the assertion message shown below:

ex_assert((retcode == HBASE_ACCESS_SUCCESS), "Error while writing the log file");

I have seen earlier that ex_assert is ignored in release mode.

I tried setting ABORT_ON_ERROR=2034 in $MY_SQROOT/etc/ms.env. With this setting, the ESP process dumped core on error -8413, even though ABORT_ON_ERROR was set to 2034, because the error is raised through ex_assert.

I think that although the ESP did not dump core earlier, the failure still triggered error 2034, but the actual error message was lost.
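To illustrate the point about an assertion being ignored in release mode, here is a minimal C++ sketch. It is not Trafodion code: WriteStatus, Diagnostics, writeErrorLogRow, logRowAssertOnly and logRowChecked are all hypothetical names, and the assertsEnabled flag only stands in for a debug versus release build. It assumes that an assertion-only check does nothing when assertions are disabled, and contrasts that with an explicit check that records the failure for the caller.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical stand-ins; none of these names come from the Trafodion source.
enum WriteStatus { WRITE_OK = 0, WRITE_FAILED = 1 };

struct Diagnostics {
    std::string detail;  // empty means no error was recorded
};

// Simulates the log-file write; a closed channel makes the write fail.
inline WriteStatus writeErrorLogRow(bool channelOpen) {
    return channelOpen ? WRITE_OK : WRITE_FAILED;
}

// Assertion-only handling. With assertions disabled (assumed release
// behaviour), the failed write is dropped and the caller sees success.
inline bool logRowAssertOnly(bool channelOpen, bool assertsEnabled,
                             Diagnostics &diags) {
    WriteStatus rc = writeErrorLogRow(channelOpen);
    if (assertsEnabled && rc != WRITE_OK) {
        std::abort();  // debug-style behaviour: stop the process
    }
    (void)diags;   // nothing is recorded on this path
    return true;   // the failure, if any, is lost here
}

// Explicit handling. The failure is recorded and reported to the caller
// regardless of how the program was built.
inline bool logRowChecked(bool channelOpen, Diagnostics &diags) {
    WriteStatus rc = writeErrorLogRow(channelOpen);
    if (rc != WRITE_OK) {
        diags.detail = "Error while writing the log file";
        return false;
    }
    return true;
}
```

With a closed channel and assertions disabled, logRowAssertOnly reports success and leaves the diagnostics empty, while logRowChecked returns false and keeps the message so that it can be surfaced to the client.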
> Load with log error rows returns the following error at times
> -------------------------------------------------------------
>
>                 Key: TRAFODION-2109
>                 URL: https://issues.apache.org/jira/browse/TRAFODION-2109
>             Project: Apache Trafodion
>          Issue Type: Bug
>          Components: sql-exe
>    Affects Versions: 1.3-incubating
>            Reporter: Selvaganesan Govindarajan
>            Assignee: Selvaganesan Govindarajan
>             Fix For: 2.1-incubating
>
>
> load with log error rows into <trafodion_table>
> select * from hive.hive.<hive_table>
> *** ERROR[2034] $Z0014JQ:5192: Operating system error 201 while communicating
> with server process $Z021FS0:425.
>
> *** ERROR[2034] $Z0014JQ:5192: Operating system error 201 while communicating
> with server process $Z021FS0:425.
>
> *** ERROR[2034] $Z0014JQ:5192: Operating system error 201 while communicating
> with server process $Z021FS0:425.
>
> --- 0 row(s) loaded.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)