While we are throwing bizarre errors around... maybe someone who has more experience reading sql logs will spot something interesting, or can explain what is happening here.
Last night I migrated an update to my data integration workflow (filters) that load data using AIE and then push it into CTM:People, CTM:People Permission Groups, User, and several other supporting forms. This was modified and heavily tested (ran about 50,000 records against it in 10 AIE runs of 5,000+ records), plus an escalation that created 8,000 more records, on my development server which is a clone of the production db on a slightly less capable server. The original code has been running since February 2008, before we went live in May. When I ran the same data exchanges from the same data on the production server it managed to induce several crashes of the thread designated for AIE, and of the AIE service. Here is a block from my arerror.log: Tue Oct 07 01:34:47 2008 : Action Request System(R) Server Version 7.1.00 Patch 002 200802011900 (c) Copyright 1991-2007 BMC Software, Inc. Tue Oct 07 02:03:30 2008 390694 : Failure during SQL operation to the database (ARERR 552) Tue Oct 07 02:03:30 2008 The COMMIT TRANSACTION request has no corresponding BEGIN TRANSACTION. (SQL Server 3902) Tue Oct 07 02:03:30 2008 390694 : AR System server terminated -- fatal error encountered (ARNOTE 21) Tue Oct 07 02:58:35 2008 390694 : Failure during SQL operation to the database (ARERR 552) Tue Oct 07 02:58:35 2008 The COMMIT TRANSACTION request has no corresponding BEGIN TRANSACTION. (SQL Server 3902) Tue Oct 07 02:58:35 2008 390694 : AR System server terminated -- fatal error encountered (ARNOTE 21) RPC Queue 390694 is a private queue that all of the AIE Data Exchanges are defined to use. It was 2 min 4 max but during troubleshooting I dialed it back to 1 and 1 and have left it there. I can find no errors at all in the filter logs, but they are visible in the sql logs: Lines start with <SQL > <TID: 0000004484> <RPC ID: 0000084984> <Queue: Prv:390694> <Client-RPC: 390694 > <USER: Demo ...and all lines NOT for TID: 0000004484 were removed. ... (abc1234 is the substitute login name) LINE 72711 Tue Oct 07 2008 02:58:35.6890 */SELECT licTypeApp FROM user_cache WHERE serverId = 0 AND entryId = '000000000154722' AND (NOT (licTypeApp IS NULL)) Tue Oct 07 2008 02:58:35.6890 */OK Tue Oct 07 2008 02:58:35.6890 */DELETE FROM user_cache WHERE serverId = 0 AND entryId = '000000000154722' Tue Oct 07 2008 02:58:35.6890 */OK Tue Oct 07 2008 02:58:35.6890 */INSERT INTO user_cache (serverId,entryId,userName,password,authUserName,shortAuthString,notifyMech,licType,licTypeFText,licTypeReserv1,timestamp,email,shortGroup,shortCompGroup,validateKey,badPwd,badPwdTotal,pwdChange) VALUES (0,'000000000154722','abc1234','lhUMXReSqE5le+Bj4mT61FAK4tP9',NULL,NULL,2,0,0,0,1223366315,'[EMAIL PROTECTED]','1000000006;1000000008;',NULL,'JBg+RzJnCTBhyjsr7SyIH2SskFGI',0,0,0) Tue Oct 07 2008 02:58:35.6890 */OK Tue Oct 07 2008 02:58:35.6890 */SELECT fixedLicChange FROM user_cache WHERE serverId = 0 AND entryId = '000000000154722' Tue Oct 07 2008 02:58:35.6890 */OK Tue Oct 07 2008 02:58:35.6890 */Set 32 byte LOB into the above row ... Tue Oct 07 2008 02:58:35.7050 */OK Tue Oct 07 2008 02:58:35.7050 */UPDATE T30 SET C104='1000000008;1000000006;',C60988=';''abc1234'';',C5='Demo',C6=1223366315 WHERE C1 = '000000000154722' Tue Oct 07 2008 02:58:35.7200 */OK Tue Oct 07 2008 02:58:35.7200 */UPDATE T30 SET C122 = NULL WHERE C1 = '000000000154722' Tue Oct 07 2008 02:58:35.7200 */OK Tue Oct 07 2008 02:58:35.7200 */SELECT C101,C102,C103,C104,C108,C109,C110,C119,C122 FROM T30 WHERE C1 = '000000000154722' Tue Oct 07 2008 02:58:35.7200 */OK Tue Oct 07 2008 02:58:35.7200 */SELECT userName,password,shortAuthString,longAuthString,fixedLicChange,shortGroup,longGroup,shortCompGroup,longCompGroup,badPwdTotal FROM user_cache WHERE entryId = '000000000154722' Tue Oct 07 2008 02:58:35.7510 */OK Tue Oct 07 2008 02:58:35.7510 */SELECT licTypeApp FROM user_cache WHERE serverId = 0 AND entryId = '000000000154722' AND (NOT (licTypeApp IS NULL)) Tue Oct 07 2008 02:58:35.7510 */OK Tue Oct 07 2008 02:58:35.7510 */DELETE FROM user_cache WHERE serverId = 0 AND entryId = '000000000154722' Tue Oct 07 2008 02:58:35.7510 */OK Tue Oct 07 2008 02:58:35.7510 */INSERT INTO user_cache (serverId,entryId,userName,password,authUserName,shortAuthString,notifyMech,licType,licTypeFText,licTypeReserv1,timestamp,email,shortGroup,shortCompGroup,validateKey,badPwd,badPwdTotal,pwdChange) VALUES (0,'000000000154722','abc1234','Zpi+Oj/t90D2TAsyZN5pp3+FsYMj',NULL,NULL,2,0,0,0,1223366315,'[EMAIL PROTECTED]','1000000008;1000000006;',NULL,'6y7llvQQFCA+9U6+IZOfwU+rPJ/G',0,0,0) Tue Oct 07 2008 02:58:35.7510 */OK Tue Oct 07 2008 02:58:35.7510 */SELECT fixedLicChange FROM user_cache WHERE serverId = 0 AND entryId = '000000000154722' Tue Oct 07 2008 02:58:35.7510 */OK Tue Oct 07 2008 02:58:35.7510 */Set 32 byte LOB into the above row ... Tue Oct 07 2008 02:58:35.7670 */OK Tue Oct 07 2008 02:58:35.7670 */COMMIT TRANSACTION Tue Oct 07 2008 02:58:35.7670 */*** ERROR *** The COMMIT TRANSACTION request has no corresponding BEGIN TRANSACTION. (SQL Server 3902) Tue Oct 07 2008 02:58:35.7670 */CLOSE DATABASE Tue Oct 07 2008 02:58:35.7670 */DROP TABLE AR0int4484 Tue Oct 07 2008 02:58:35.7670 */*** ERROR *** Cannot drop the table 'AR0int4484', because it does not exist or you do not have permission. (SQL Server 3701) Tue Oct 07 2008 02:58:35.7670 */DROP TABLE AR0float4484 Tue Oct 07 2008 02:58:35.7670 */*** ERROR *** Cannot drop the table 'AR0float4484', because it does not exist or you do not have permission. (SQL Server 3701) Tue Oct 07 2008 02:58:35.7670 */DROP TABLE AR0char4484 Tue Oct 07 2008 02:58:35.7670 */*** ERROR *** Cannot drop the table 'AR0char4484', because it does not exist or you do not have permission. (SQL Server 3701) Tue Oct 07 2008 02:58:35.7670 */DROP TABLE AR0decimal4484 Tue Oct 07 2008 02:58:35.7670 */*** ERROR *** Cannot drop the table 'AR0decimal4484', because it does not exist or you do not have permission. (SQL Server 3701) Tue Oct 07 2008 02:58:35.7830 */CONNECT ARSystem, USER ARAdmin LINE 72914 The only BEGIN TRANSACTION I found for TID: 0000004484 was at Tue Oct 07 2008 02:58:33.5340 */BEGIN TRANSACTION, and was at LINE 68813 and had 91 entries for that TID between it and the top of this selection, mostly UPDATE <Table> SET <Fields> commands. A separate escalation that created 8,000 people and user record pairs from updated source records (using the same filters fired from AIE) took an hour, but did not generate any errors at all. There were never any errors recorded on the remote SQL Server (2005 x64 Sp2). It only occurred in a couple of instances, during 3 of the first 5 data exchanges I ran out of 10 defined, using the same data source that were used to test on development. I have an issue open on this, of course, since it crashed the thread several times and I have not seen anything quite like this before. Christopher Strauss, Ph.D. Call Tracking Administration Manager University of North Texas Computing & IT Center http://itsm.unt.edu/ _______________________________________________________________________________ UNSUBSCRIBE or access ARSlist Archives at www.arslist.org Platinum Sponsor: www.rmsportal.com ARSlist: "Where the Answers Are"