[jira] [Commented] (TRAFODION-2109) Load with log error rows returns the following error at times
[ https://issues.apache.org/jira/browse/TRAFODION-2109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15388446#comment-15388446 ]

ASF GitHub Bot commented on TRAFODION-2109:
-------------------------------------------

Github user selvaganesang closed the pull request at:

    https://github.com/apache/incubator-trafodion/pull/593

> Load with log error rows returns the following error at times
> -------------------------------------------------------------
>
>                 Key: TRAFODION-2109
>                 URL: https://issues.apache.org/jira/browse/TRAFODION-2109
>             Project: Apache Trafodion
>          Issue Type: Bug
>          Components: sql-exe
>    Affects Versions: 1.3-incubating
>            Reporter: Selvaganesan Govindarajan
>            Assignee: Selvaganesan Govindarajan
>             Fix For: 2.1-incubating
>
>
> load with log error rows into
> select * from hive.hive.
> *** ERROR[2034] $Z0014JQ:5192: Operating system error 201 while communicating
> with server process $Z021FS0:425.
>
> *** ERROR[2034] $Z0014JQ:5192: Operating system error 201 while communicating
> with server process $Z021FS0:425.
>
> *** ERROR[2034] $Z0014JQ:5192: Operating system error 201 while communicating
> with server process $Z021FS0:425.
>
> --- 0 row(s) loaded.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (TRAFODION-2109) Load with log error rows returns the following error at times
[ https://issues.apache.org/jira/browse/TRAFODION-2109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15379812#comment-15379812 ]

ASF GitHub Bot commented on TRAFODION-2109:
-------------------------------------------

Github user selvaganesang commented on a diff in the pull request:

    https://github.com/apache/incubator-trafodion/pull/593#discussion_r71015648

    --- Diff: core/sql/executor/ExHbaseAccess.cpp ---
    @@ -2996,6 +3003,14 @@ void ExHbaseAccessTcb::handleException(NAHeap *heap,
           errorMsgLen = strlen("[UNKNOWN EXCEPTION]\n");
    --- End diff --

    will do
[jira] [Commented] (TRAFODION-2109) Load with log error rows returns the following error at times
[ https://issues.apache.org/jira/browse/TRAFODION-2109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15379804#comment-15379804 ]

ASF GitHub Bot commented on TRAFODION-2109:
-------------------------------------------

Github user selvaganesang commented on a diff in the pull request:

    https://github.com/apache/incubator-trafodion/pull/593#discussion_r71015190

    --- Diff: core/sql/executor/ExHbaseAccess.cpp ---
    @@ -2967,22 +2968,28 @@ void ExHbaseAccessTcb::handleException(NAHeap *heap,
                                         ComCondition *errorCond,
                                         ExpHbaseInterface * ehi,
                                         NABoolean & LoggingFileCreated,
    -                                    char *loggingFileName)
    +                                    char *loggingFileName,
    +                                    ComDiagsArea **loggingErrorDiags)
     {
       Lng32 errorMsgLen = 0;
       charBuf *cBuf = NULL;
       char *errorMsg;
       Lng32 retcode;

    +  if (*loggingErrorDiags != NULL)
    --- End diff --

    Basically, we want to skip the error rows in the source and continue to load. The load command has constructs to log the error row and the error information about it. We just want to report an error encountered while logging the error rows as a warning at the end of the load command. This check skips the logging once an error has been encountered in an earlier attempt to log an error row.
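The behaviour described in this comment (remember the first logging failure, skip later logging attempts, and surface the failure as a warning when the load finishes) might be sketched roughly as follows. This is only an illustration, not the code from the pull request: `DiagsSketch`, `Row`, `ErrorRowLogger`, `LoadResult`, and `runLoad` are hypothetical names, and the `failWrites` flag merely simulates a failing write to the error log.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a diagnostics area; not the real ComDiagsArea.
struct DiagsSketch { std::string message; };

// Hypothetical source row; `bad` marks a row that cannot be loaded.
struct Row { std::string data; bool bad; };

// Hypothetical logger for rejected rows.
class ErrorRowLogger {
public:
    bool failWrites = false;                    // simulates a failing log write
    std::vector<std::string> logged;            // rows successfully logged
    std::unique_ptr<DiagsSketch> loggingError;  // set on the first logging failure

    void logErrorRow(const Row &row) {
        // An earlier attempt to log already failed: skip further logging.
        if (loggingError) return;
        if (failWrites) {
            // Remember the failure instead of aborting the load.
            loggingError.reset(new DiagsSketch{"could not log row: " + row.data});
            return;
        }
        logged.push_back(row.data);
    }
};

struct LoadResult { int rowsLoaded; int rowsSkipped; std::string warning; };

// Loads good rows, skips bad ones, and reports any logging failure
// as a warning once the whole load has finished.
LoadResult runLoad(const std::vector<Row> &rows, ErrorRowLogger &logger) {
    LoadResult result{0, 0, ""};
    for (const Row &row : rows) {
        if (row.bad) {
            ++result.rowsSkipped;
            logger.logErrorRow(row);
            continue;
        }
        ++result.rowsLoaded;
    }
    if (logger.loggingError) result.warning = logger.loggingError->message;
    return result;
}
```

In this sketch the load keeps going after the log write fails; only the first failure is recorded, which mirrors the "skip the logging once an error is encountered" check discussed above.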
[jira] [Commented] (TRAFODION-2109) Load with log error rows returns the following error at times
[ https://issues.apache.org/jira/browse/TRAFODION-2109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15379707#comment-15379707 ]

ASF GitHub Bot commented on TRAFODION-2109:
-------------------------------------------

Github user DaveBirdsall commented on a diff in the pull request:

    https://github.com/apache/incubator-trafodion/pull/593#discussion_r71007538

    --- Diff: core/sql/executor/ExHbaseAccess.cpp ---
    @@ -2967,22 +2968,28 @@ void ExHbaseAccessTcb::handleException(NAHeap *heap,
                                         ComCondition *errorCond,
                                         ExpHbaseInterface * ehi,
                                         NABoolean & LoggingFileCreated,
    -                                    char *loggingFileName)
    +                                    char *loggingFileName,
    +                                    ComDiagsArea **loggingErrorDiags)
     {
       Lng32 errorMsgLen = 0;
       charBuf *cBuf = NULL;
       char *errorMsg;
       Lng32 retcode;

    +  if (*loggingErrorDiags != NULL)
    --- End diff --

    Not quite sure I understand this logic. If a ComDiagsArea object has already been allocated, does that mean we have already reported the exception somewhere else? Or perhaps this is not the first exception and we simply don't want to log more than one?
[jira] [Commented] (TRAFODION-2109) Load with log error rows returns the following error at times
[ https://issues.apache.org/jira/browse/TRAFODION-2109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15379698#comment-15379698 ]

ASF GitHub Bot commented on TRAFODION-2109:
-------------------------------------------

Github user DaveBirdsall commented on a diff in the pull request:

    https://github.com/apache/incubator-trafodion/pull/593#discussion_r71007168

    --- Diff: core/sql/executor/ExHbaseAccess.cpp ---
    @@ -2996,6 +3003,14 @@ void ExHbaseAccessTcb::handleException(NAHeap *heap,
           errorMsgLen = strlen("[UNKNOWN EXCEPTION]\n");
    --- End diff --

    Looks like errorMsg is uninitialized in this code path. Perhaps we need to add errorMsg = "[UNKNOWN EXCEPTION]" here? (And maybe change this line to errorMsgLen = strlen(errorMsg)?)
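The suggestion in this review comment, assigning the fallback text to the pointer and then deriving the length from the pointer, could look something like the sketch below. It is not the actual handleException code: `ErrorText` and `pickErrorText` are hypothetical names, and the sketch uses `const char *` because standard C++ does not allow a string literal to be assigned to a plain `char *`.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical pair of message pointer and length, for illustration only.
struct ErrorText {
    const char *msg;
    std::size_t len;
};

// exceptionText may be null when no exception text is available.
ErrorText pickErrorText(const char *exceptionText) {
    const char *errorMsg = nullptr;  // never left uninitialized
    if (exceptionText != nullptr) {
        errorMsg = exceptionText;
    } else {
        // Fallback path: set the pointer itself, not just the length.
        errorMsg = "[UNKNOWN EXCEPTION]\n";
    }
    // The length is computed from the pointer, so the two cannot disagree.
    return ErrorText{errorMsg, std::strlen(errorMsg)};
}
```

Computing the length from `errorMsg` rather than from a second copy of the literal means a later change to the fallback text cannot leave the pointer and the length out of sync.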
[jira] [Commented] (TRAFODION-2109) Load with log error rows returns the following error at times
[ https://issues.apache.org/jira/browse/TRAFODION-2109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15378462#comment-15378462 ]

ASF GitHub Bot commented on TRAFODION-2109:
-------------------------------------------

GitHub user selvaganesang opened a pull request:

    https://github.com/apache/incubator-trafodion/pull/593

    [TRAFODION-2109] Load with log error rows returns SQL error 2034 at times

    Changed the error raised while logging error rows into a warning so that the load can continue.
    Also improved the error reporting to display the stack trace when a Java method call fails during loading and unloading.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/selvaganesang/incubator-trafodion trafodion-2109

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-trafodion/pull/593.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #593

commit 37f7789877d7991c73e8d202735d399ab93d8fac
Author: selvaganesang
Date:   2016-07-14T21:44:56Z

    [TRAFODION-2109] Load with log error rows returns SQL error 2034 at times

    Changed the error at the time of logging into error rows into a warning so that load can continue.
    Also, improved the error reporting to display stack trace when java method call fails at the time of loading and unloading.
[jira] [Commented] (TRAFODION-2109) Load with log error rows returns the following error at times
[ https://issues.apache.org/jira/browse/TRAFODION-2109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15376315#comment-15376315 ]

Selvaganesan Govindarajan commented on TRAFODION-2109:
------------------------------------------------------

Analysis of the issue reveals the following:

One of the ESPs returned the internal error below while writing the error log record to HDFS.

data_ = 0x7f8a38756d70 "\njava.nio.channels.ClosedChannelException\n org.apache.hadoop.hdfs.DFSOutputStream.checkClosed(DFSOutputStream.java:1635)\n org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:104)\n org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:58)\n java.io.DataOutputStream.write(DataOutputStream.java:107)\n java.io.FilterOutputStream.write(FilterOutputStream.java:97)\n org.trafodion.sql.HiveClient.hdfsWrite(HiveClient.java:278)\n"

The HDFS scan returned error -8413. The error row, along with its error information, is written to the error log file in HDFS. But the write to the error log file itself returned an error, as shown in the stack trace above.

This ESP is supposed to dump core on the assertion shown below:

ex_assert((retcode == HBASE_ACCESS_SUCCESS), "Error while writing the log file");

I have seen earlier that ex_assert is ignored in release mode. I tried with ABORT_ON_ERROR=2034 in $MY_SQROOT/etc/ms.env. With this setting the ESP process dumped core on error -8413, even though ABORT_ON_ERROR was set to 2034, because the error is thrown due to ex_assert. I think that although it didn’t dump core earlier, it triggered error 2034. But the actual error message was lost.
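To illustrate why relying on an assertion can lose the real error, here is a minimal sketch contrasting an assertion-only check with an explicit check that records the failure. It does not reproduce Trafodion's ex_assert: `SKETCH_ASSERT`, `SKETCH_DEBUG_BUILD`, `WriteStatus`, `LoadDiags`, and both functions are hypothetical, and the macro simply assumes the common pattern of an assertion that compiles to nothing outside debug builds.

```cpp
#include <cassert>
#include <string>

// Hypothetical status codes; not the real HBASE_ACCESS_* values.
enum WriteStatus { WRITE_OK = 0, WRITE_FAILED = 1 };

// Hypothetical diagnostics holder for the sketch.
struct LoadDiags {
    int warnings = 0;
    std::string lastMessage;
};

// Hypothetical assertion macro: active only when SKETCH_DEBUG_BUILD is
// defined, and compiled out otherwise (assumed release-style behaviour).
#ifdef SKETCH_DEBUG_BUILD
#define SKETCH_ASSERT(cond, msg) assert((cond) && (msg))
#else
#define SKETCH_ASSERT(cond, msg) ((void)0)
#endif

// Assertion-only handling: when the macro is compiled out, a failed
// write leaves no trace and the caller cannot tell anything went wrong.
bool logRowAssertOnly(WriteStatus status, LoadDiags &diags) {
    SKETCH_ASSERT(status == WRITE_OK, "Error while writing the log file");
    (void)status;
    (void)diags;
    return true;
}

// Explicit handling: the failure and its detail are kept as a warning,
// regardless of build mode.
bool logRowChecked(WriteStatus status, const std::string &detail,
                   LoadDiags &diags) {
    if (status != WRITE_OK) {
        ++diags.warnings;
        diags.lastMessage = "Error while writing the log file: " + detail;
        return false;
    }
    return true;
}
```

With the macro compiled out, `logRowAssertOnly` reports success even for a failed write, whereas `logRowChecked` preserves the underlying detail (for example the exception name from the stack trace) so it can be reported later.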