[jira] [Commented] (TS-1330) Logging related segfault in 3.2.0

2013-08-03 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-1330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13728651#comment-13728651
 ] 

ASF subversion and git services commented on TS-1330:
-

Commit 1bb12edf9cd9b0076eaf39b8665909846743ee73 in branch refs/heads/3.3.x from 
[~zwoop]
[ https://git-wip-us.apache.org/repos/asf?p=trafficserver.git;h=1bb12ed ]

Merge branch 'master' into 3.3.x

* master: (295 commits)
  TS-2081 Make the WCCP addr configuration LOCAL
  TS-2093: Check bounds on plugin stat creation.
  TS-2092 Use of uninitialized member in HdrHeap.
  [TS-2052] ET_SSL thread spinning
  TS-2090 Make proxy.config.allocator.enable_reclaim default based on build 
instructions
  TS-1006: adjust some reclaimable-freelist's default configuration
  Updated newish tests to the gitignore
  Fix autoconf checks for mcheck_pedantic()
  Fix configure check for eventfd
  TS-1330: Fix logging core at checkout_write()
  add some standard extensions to ignore list
  TS-1953: remove version check from stable plugins
  TS-1953: Remove check_ts_version() from experimental plugins
  TS-1953: remove check_ts_version() from examples
  Added TS-2086.
  TS-2086 Remove a few more unused configs
  Added TS-1685
  TS-1685 Remove TS_MICRO and fellas
  Added Ts-1255
  TS-1255 Fix the types for all overridable configs. This was actually a real 
bug in the code, in that all float configurations were actually treated as 
integer, rendering them useless
  ...


 Logging related segfault in 3.2.0
 -

 Key: TS-1330
 URL: https://issues.apache.org/jira/browse/TS-1330
 Project: Traffic Server
  Issue Type: Bug
  Components: Logging
Affects Versions: 3.2.0
 Environment: ATS 3.2.0 on RHEL 6.2 64-bit
Reporter: David Carlin
Assignee: Yunkai Zhang
Priority: Critical
  Labels: A, crash
 Fix For: 3.3.5

 Attachments: 0001-A-testing-patch-to-verify-my-analysis.patch, 
 0001-TS-1330-Fix-logging-core-at-checkout_write.patch, 
 0001-TS-1330-Fix-logging-core-at-checkout_write.patch.V2


 I observed the following crash once on one of our ATS boxes - possibly 
 related to TS-1240
 Jul  2 13:56:56 l2 traffic_server[25853]: {0x2b0a391e1700} ERROR: 
 [SSL_NetVConnection::ssl_read_from_net] SSL_ERROR_SYSCALL, underlying IO 
 error: Connection reset by peer
 Jul  2 13:59:56 l2 kernel: [ET_NET 1][25855]: segfault at c ip 
 0058e083 sp 2b0a2982b740 error 6
 Jul  2 13:59:56 l2 kernel: [ET_NET 3][25857]: segfault at 84 ip 
 0058e083 sp 2b0a29a31740 error 6 in traffic_server[40+34]
 Jul  2 13:59:56 l2 kernel: in traffic_server[40+34]
 Jul  2 14:02:59 l2 traffic_cop[25901]: (test) read timeout [18 ]
 Jul  2 14:02:59 l2 traffic_cop[25901]: server heartbeat failed [1]
 Jul  2 14:03:08 l2 traffic_manager[25826]: {0x7f3f088607e0} FATAL: 
 [LocalManager::pollMgmtProcessServer] Error in read (errno: 104)
 Jul  2 14:03:09 l2 traffic_manager[25826]: {0x7f3f088607e0} FATAL:  (last 
 system error 104: Connection reset by peer)
 Jul  2 14:03:09 l2 traffic_cop[25901]: cannot find traffic_server [1]
 Jul  2 14:03:09 l2 traffic_manager[25826]: {0x7f3f088607e0} ERROR: 
 [LocalManager::sendMgmtMsgToProcesses] Error writing message
 Jul  2 14:03:09 l2 traffic_manager[25826]: {0x7f3f088607e0} ERROR:  (last 
 system error 32: Broken pipe)
 Jul  2 14:03:12 l2 traffic_cop[25901]: cop received child status signal 
 [25826 35584]
 Jul  2 14:03:12 l2 traffic_cop[25901]: traffic_manager not running, making 
 sure traffic_server is dead
 Jul  2 14:03:12 l2 traffic_cop[25901]: spawning traffic_manager
 Jul  2 14:03:13 l2 traffic_manager[18267]: NOTE: --- Manager Starting ---
 Jul  2 14:03:13 l2 traffic_manager[18267]: NOTE: Manager Version: Apache 
 Traffic Server - traffic_manager - 3.2.0 - (build # 52518 on Jun 25 2012 at 
 18:22:12)
 Jul  2 14:03:13 l2 traffic_manager[18267]: {0x7fe63de3f7e0} STATUS: opened 
 /home/y/logs/trafficserver/manager.log
 Jul  2 14:03:15 l2 traffic_server[18322]: NOTE: --- Server Starting ---
 Jul  2 14:03:15 l2 traffic_server[18322]: NOTE: Server Version: Apache 
 Traffic Server - traffic_server - 3.2.0 - (build # 52518 on Jun 25 2012 at 
 18:22:31)
 Jul  2 14:03:15 l2 traffic_server[18322]: {0x2b77573ab860} STATUS: opened 
 /home/y/logs/trafficserver/diags.log
 Jul  2 14:03:15 l2 traffic_server[18322]: {0x2b77573ab860} ERROR: Cannot 
 insert duplicate!
 Jul  2 14:03:22 l2 traffic_cop[25901]: server heartbeat succeeded
 [Jul  2 13:56:56.304] Server {0x2b0a391e1700} ERROR: 
 [SSL_NetVConnection::ssl_read_from_net] SSL_ERROR_SYSCALL, underlying IO 
 error: Connection reset by peer
 NOTE: Traffic Server received Sig 11: Segmentation fault
 NOTE: Traffic Server received Sig 11: Segmentation fault
 /home/y/bin/traffic_server - STACK TRACE: 
 

[jira] [Commented] (TS-1330) Logging related segfault in 3.2.0

2013-08-01 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-1330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13727276#comment-13727276
 ] 

ASF subversion and git services commented on TS-1330:
-

Commit c1c3db9d7b524176e611a6aba3066d1a732b4bd2 in branch refs/heads/t from 
[~yunkai]
[ https://git-wip-us.apache.org/repos/asf?p=trafficserver.git;h=c1c3db9 ]

TS-1330: Fix logging core at checkout_write()

== Analysis ==
1. One thread(A) got an LogBuffer object in LogObject::_checkout_write() and ran
   into LogBuffer::checkout_write(), before it updated the LogBuffer::m_state,
   it was scheduled out by the kernel.

2. Other threads got the same LogBuffer object, but they all finished
   _checkout_write() codes and made the LogBuffer object become full, finally
   one of these thread flushed the LogBuffer object into write_list queue.

3. The LogBuffer object's m_references == 0 at this time.

4. The flushing thread would call LogBufferManager::flush_buffers() to loop
   write_list. As the LogBuffer's m_references == 0, so it would be flushed
   and deleted.

5. Now, thread(A) would be scheduled in by the kernel, but the LogBuffer object
   has been deleted, so the m_buffer == NULL, that is why entry_header is 0x0.

== Solution ==
1. Remove flushing code from LogObject::log(), we should flush LogBuffer into
   queue only when new_buffer is created, so that we can promise that LogBuffer
   must be flushed only once.

2. The flushing thread should both check m_references and m_state.s.num_writers.

3. When allocate a new_buffer, should avoid memory leak.

Signed-off-by: Yunkai Zhang qiushu@taobao.com


 Logging related segfault in 3.2.0
 -

 Key: TS-1330
 URL: https://issues.apache.org/jira/browse/TS-1330
 Project: Traffic Server
  Issue Type: Bug
  Components: Logging
Affects Versions: 3.2.0
 Environment: ATS 3.2.0 on RHEL 6.2 64-bit
Reporter: David Carlin
Assignee: Yunkai Zhang
Priority: Critical
  Labels: A, crash
 Fix For: 3.3.5

 Attachments: 0001-A-testing-patch-to-verify-my-analysis.patch, 
 0001-TS-1330-Fix-logging-core-at-checkout_write.patch, 
 0001-TS-1330-Fix-logging-core-at-checkout_write.patch.V2


 I observed the following crash once on one of our ATS boxes - possibly 
 related to TS-1240
 Jul  2 13:56:56 l2 traffic_server[25853]: {0x2b0a391e1700} ERROR: 
 [SSL_NetVConnection::ssl_read_from_net] SSL_ERROR_SYSCALL, underlying IO 
 error: Connection reset by peer
 Jul  2 13:59:56 l2 kernel: [ET_NET 1][25855]: segfault at c ip 
 0058e083 sp 2b0a2982b740 error 6
 Jul  2 13:59:56 l2 kernel: [ET_NET 3][25857]: segfault at 84 ip 
 0058e083 sp 2b0a29a31740 error 6 in traffic_server[40+34]
 Jul  2 13:59:56 l2 kernel: in traffic_server[40+34]
 Jul  2 14:02:59 l2 traffic_cop[25901]: (test) read timeout [18 ]
 Jul  2 14:02:59 l2 traffic_cop[25901]: server heartbeat failed [1]
 Jul  2 14:03:08 l2 traffic_manager[25826]: {0x7f3f088607e0} FATAL: 
 [LocalManager::pollMgmtProcessServer] Error in read (errno: 104)
 Jul  2 14:03:09 l2 traffic_manager[25826]: {0x7f3f088607e0} FATAL:  (last 
 system error 104: Connection reset by peer)
 Jul  2 14:03:09 l2 traffic_cop[25901]: cannot find traffic_server [1]
 Jul  2 14:03:09 l2 traffic_manager[25826]: {0x7f3f088607e0} ERROR: 
 [LocalManager::sendMgmtMsgToProcesses] Error writing message
 Jul  2 14:03:09 l2 traffic_manager[25826]: {0x7f3f088607e0} ERROR:  (last 
 system error 32: Broken pipe)
 Jul  2 14:03:12 l2 traffic_cop[25901]: cop received child status signal 
 [25826 35584]
 Jul  2 14:03:12 l2 traffic_cop[25901]: traffic_manager not running, making 
 sure traffic_server is dead
 Jul  2 14:03:12 l2 traffic_cop[25901]: spawning traffic_manager
 Jul  2 14:03:13 l2 traffic_manager[18267]: NOTE: --- Manager Starting ---
 Jul  2 14:03:13 l2 traffic_manager[18267]: NOTE: Manager Version: Apache 
 Traffic Server - traffic_manager - 3.2.0 - (build # 52518 on Jun 25 2012 at 
 18:22:12)
 Jul  2 14:03:13 l2 traffic_manager[18267]: {0x7fe63de3f7e0} STATUS: opened 
 /home/y/logs/trafficserver/manager.log
 Jul  2 14:03:15 l2 traffic_server[18322]: NOTE: --- Server Starting ---
 Jul  2 14:03:15 l2 traffic_server[18322]: NOTE: Server Version: Apache 
 Traffic Server - traffic_server - 3.2.0 - (build # 52518 on Jun 25 2012 at 
 18:22:31)
 Jul  2 14:03:15 l2 traffic_server[18322]: {0x2b77573ab860} STATUS: opened 
 /home/y/logs/trafficserver/diags.log
 Jul  2 14:03:15 l2 traffic_server[18322]: {0x2b77573ab860} ERROR: Cannot 
 insert duplicate!
 Jul  2 14:03:22 l2 traffic_cop[25901]: server heartbeat succeeded
 [Jul  2 13:56:56.304] Server {0x2b0a391e1700} ERROR: 
 [SSL_NetVConnection::ssl_read_from_net] SSL_ERROR_SYSCALL, underlying IO 
 error: Connection reset by peer
 

[jira] [Commented] (TS-1330) Logging related segfault in 3.2.0

2013-08-01 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-1330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13727281#comment-13727281
 ] 

ASF subversion and git services commented on TS-1330:
-

Commit 3e05261c675b15a76f178e1d3e579c6941f42d20 in branch refs/heads/master 
from [~yunkai]
[ https://git-wip-us.apache.org/repos/asf?p=trafficserver.git;h=3e05261 ]

TS-1330: Fix logging core at checkout_write()

== Analysis ==
1. One thread(A) got an LogBuffer object in LogObject::_checkout_write() and ran
   into LogBuffer::checkout_write(), before it updated the LogBuffer::m_state,
   it was scheduled out by the kernel.

2. Other threads got the same LogBuffer object, but they all finished
   _checkout_write() codes and made the LogBuffer object become full, finally
   one of these thread flushed the LogBuffer object into write_list queue.

3. The LogBuffer object's m_references == 0 at this time.

4. The flushing thread would call LogBufferManager::flush_buffers() to loop
   write_list. As the LogBuffer's m_references == 0, so it would be flushed
   and deleted.

5. Now, thread(A) would be scheduled in by the kernel, but the LogBuffer object
   has been deleted, so the m_buffer == NULL, that is why entry_header is 0x0.

== Solution ==
1. Remove flushing code from LogObject::log(), we should flush LogBuffer into
   queue only when new_buffer is created, so that we can promise that LogBuffer
   must be flushed only once.

2. The flushing thread should both check m_references and m_state.s.num_writers.

3. When allocate a new_buffer, should avoid memory leak.

Signed-off-by: Yunkai Zhang qiushu@taobao.com


 Logging related segfault in 3.2.0
 -

 Key: TS-1330
 URL: https://issues.apache.org/jira/browse/TS-1330
 Project: Traffic Server
  Issue Type: Bug
  Components: Logging
Affects Versions: 3.2.0
 Environment: ATS 3.2.0 on RHEL 6.2 64-bit
Reporter: David Carlin
Assignee: Yunkai Zhang
Priority: Critical
  Labels: A, crash
 Fix For: 3.3.5

 Attachments: 0001-A-testing-patch-to-verify-my-analysis.patch, 
 0001-TS-1330-Fix-logging-core-at-checkout_write.patch, 
 0001-TS-1330-Fix-logging-core-at-checkout_write.patch.V2


 I observed the following crash once on one of our ATS boxes - possibly 
 related to TS-1240
 Jul  2 13:56:56 l2 traffic_server[25853]: {0x2b0a391e1700} ERROR: 
 [SSL_NetVConnection::ssl_read_from_net] SSL_ERROR_SYSCALL, underlying IO 
 error: Connection reset by peer
 Jul  2 13:59:56 l2 kernel: [ET_NET 1][25855]: segfault at c ip 
 0058e083 sp 2b0a2982b740 error 6
 Jul  2 13:59:56 l2 kernel: [ET_NET 3][25857]: segfault at 84 ip 
 0058e083 sp 2b0a29a31740 error 6 in traffic_server[40+34]
 Jul  2 13:59:56 l2 kernel: in traffic_server[40+34]
 Jul  2 14:02:59 l2 traffic_cop[25901]: (test) read timeout [18 ]
 Jul  2 14:02:59 l2 traffic_cop[25901]: server heartbeat failed [1]
 Jul  2 14:03:08 l2 traffic_manager[25826]: {0x7f3f088607e0} FATAL: 
 [LocalManager::pollMgmtProcessServer] Error in read (errno: 104)
 Jul  2 14:03:09 l2 traffic_manager[25826]: {0x7f3f088607e0} FATAL:  (last 
 system error 104: Connection reset by peer)
 Jul  2 14:03:09 l2 traffic_cop[25901]: cannot find traffic_server [1]
 Jul  2 14:03:09 l2 traffic_manager[25826]: {0x7f3f088607e0} ERROR: 
 [LocalManager::sendMgmtMsgToProcesses] Error writing message
 Jul  2 14:03:09 l2 traffic_manager[25826]: {0x7f3f088607e0} ERROR:  (last 
 system error 32: Broken pipe)
 Jul  2 14:03:12 l2 traffic_cop[25901]: cop received child status signal 
 [25826 35584]
 Jul  2 14:03:12 l2 traffic_cop[25901]: traffic_manager not running, making 
 sure traffic_server is dead
 Jul  2 14:03:12 l2 traffic_cop[25901]: spawning traffic_manager
 Jul  2 14:03:13 l2 traffic_manager[18267]: NOTE: --- Manager Starting ---
 Jul  2 14:03:13 l2 traffic_manager[18267]: NOTE: Manager Version: Apache 
 Traffic Server - traffic_manager - 3.2.0 - (build # 52518 on Jun 25 2012 at 
 18:22:12)
 Jul  2 14:03:13 l2 traffic_manager[18267]: {0x7fe63de3f7e0} STATUS: opened 
 /home/y/logs/trafficserver/manager.log
 Jul  2 14:03:15 l2 traffic_server[18322]: NOTE: --- Server Starting ---
 Jul  2 14:03:15 l2 traffic_server[18322]: NOTE: Server Version: Apache 
 Traffic Server - traffic_server - 3.2.0 - (build # 52518 on Jun 25 2012 at 
 18:22:31)
 Jul  2 14:03:15 l2 traffic_server[18322]: {0x2b77573ab860} STATUS: opened 
 /home/y/logs/trafficserver/diags.log
 Jul  2 14:03:15 l2 traffic_server[18322]: {0x2b77573ab860} ERROR: Cannot 
 insert duplicate!
 Jul  2 14:03:22 l2 traffic_cop[25901]: server heartbeat succeeded
 [Jul  2 13:56:56.304] Server {0x2b0a391e1700} ERROR: 
 [SSL_NetVConnection::ssl_read_from_net] SSL_ERROR_SYSCALL, underlying IO 
 error: Connection reset by 

[jira] [Commented] (TS-1330) Logging related segfault in 3.2.0

2013-07-31 Thread Yunkai Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-1330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726007#comment-13726007
 ] 

Yunkai Zhang commented on TS-1330:
--

I have tested V2 patch, it fixed this issue completely, it's time to merge it:).

 Logging related segfault in 3.2.0
 -

 Key: TS-1330
 URL: https://issues.apache.org/jira/browse/TS-1330
 Project: Traffic Server
  Issue Type: Bug
  Components: Logging
Affects Versions: 3.2.0
 Environment: ATS 3.2.0 on RHEL 6.2 64-bit
Reporter: David Carlin
Assignee: Yunkai Zhang
Priority: Critical
  Labels: A, crash
 Fix For: 3.3.5

 Attachments: 0001-A-testing-patch-to-verify-my-analysis.patch, 
 0001-TS-1330-Fix-logging-core-at-checkout_write.patch, 
 0001-TS-1330-Fix-logging-core-at-checkout_write.patch.V2


 I observed the following crash once on one of our ATS boxes - possibly 
 related to TS-1240
 Jul  2 13:56:56 l2 traffic_server[25853]: {0x2b0a391e1700} ERROR: 
 [SSL_NetVConnection::ssl_read_from_net] SSL_ERROR_SYSCALL, underlying IO 
 error: Connection reset by peer
 Jul  2 13:59:56 l2 kernel: [ET_NET 1][25855]: segfault at c ip 
 0058e083 sp 2b0a2982b740 error 6
 Jul  2 13:59:56 l2 kernel: [ET_NET 3][25857]: segfault at 84 ip 
 0058e083 sp 2b0a29a31740 error 6 in traffic_server[40+34]
 Jul  2 13:59:56 l2 kernel: in traffic_server[40+34]
 Jul  2 14:02:59 l2 traffic_cop[25901]: (test) read timeout [18 ]
 Jul  2 14:02:59 l2 traffic_cop[25901]: server heartbeat failed [1]
 Jul  2 14:03:08 l2 traffic_manager[25826]: {0x7f3f088607e0} FATAL: 
 [LocalManager::pollMgmtProcessServer] Error in read (errno: 104)
 Jul  2 14:03:09 l2 traffic_manager[25826]: {0x7f3f088607e0} FATAL:  (last 
 system error 104: Connection reset by peer)
 Jul  2 14:03:09 l2 traffic_cop[25901]: cannot find traffic_server [1]
 Jul  2 14:03:09 l2 traffic_manager[25826]: {0x7f3f088607e0} ERROR: 
 [LocalManager::sendMgmtMsgToProcesses] Error writing message
 Jul  2 14:03:09 l2 traffic_manager[25826]: {0x7f3f088607e0} ERROR:  (last 
 system error 32: Broken pipe)
 Jul  2 14:03:12 l2 traffic_cop[25901]: cop received child status signal 
 [25826 35584]
 Jul  2 14:03:12 l2 traffic_cop[25901]: traffic_manager not running, making 
 sure traffic_server is dead
 Jul  2 14:03:12 l2 traffic_cop[25901]: spawning traffic_manager
 Jul  2 14:03:13 l2 traffic_manager[18267]: NOTE: --- Manager Starting ---
 Jul  2 14:03:13 l2 traffic_manager[18267]: NOTE: Manager Version: Apache 
 Traffic Server - traffic_manager - 3.2.0 - (build # 52518 on Jun 25 2012 at 
 18:22:12)
 Jul  2 14:03:13 l2 traffic_manager[18267]: {0x7fe63de3f7e0} STATUS: opened 
 /home/y/logs/trafficserver/manager.log
 Jul  2 14:03:15 l2 traffic_server[18322]: NOTE: --- Server Starting ---
 Jul  2 14:03:15 l2 traffic_server[18322]: NOTE: Server Version: Apache 
 Traffic Server - traffic_server - 3.2.0 - (build # 52518 on Jun 25 2012 at 
 18:22:31)
 Jul  2 14:03:15 l2 traffic_server[18322]: {0x2b77573ab860} STATUS: opened 
 /home/y/logs/trafficserver/diags.log
 Jul  2 14:03:15 l2 traffic_server[18322]: {0x2b77573ab860} ERROR: Cannot 
 insert duplicate!
 Jul  2 14:03:22 l2 traffic_cop[25901]: server heartbeat succeeded
 [Jul  2 13:56:56.304] Server {0x2b0a391e1700} ERROR: 
 [SSL_NetVConnection::ssl_read_from_net] SSL_ERROR_SYSCALL, underlying IO 
 error: Connection reset by peer
 NOTE: Traffic Server received Sig 11: Segmentation fault
 NOTE: Traffic Server received Sig 11: Segmentation fault
 /home/y/bin/traffic_server - STACK TRACE: 
 /home/y/bin/traffic_server - STACK TRACE: 
 /lib64/libpthread.so.0[0x3b54e0f4a0]
 /lib64/libpthread.so.0[0x3b54e0f4a0]
 /home/y/bin/traffic_server(_ZN9LogBuffer14checkout_writeEPmm+0x153)[0x58e083]
 /home/y/bin/traffic_server(_ZN9LogObject15_checkout_writeEPmm+0xa8)[0x5a64c8]
 /home/y/bin/traffic_server(_ZN9LogObject3logEP9LogAccessPc+0x2f0)[0x5a7e30]
 /home/y/bin/traffic_server(_ZN9LogBuffer14checkout_writeEPmm+0x153)[0x58e083]
 /home/y/bin/traffic_server(_ZN9LogObject15_checkout_writeEPmm+0xa8)[0x5a64c8]
 /home/y/bin/traffic_server(_ZN9LogObject3logEP9LogAccessPc+0x2f0)[0x5a7e30]
 /home/y/bin/traffic_server(_ZN3Log6accessEP9LogAccess+0x146)[0x58f506]
 /home/y/bin/traffic_server(_ZN6HttpSM12update_statsEv+0x630)[0x526c50]
 /home/y/bin/traffic_server(_ZN3Log6accessEP9LogAccess+0x146)[0x58f506]
 /home/y/bin/traffic_server(_ZN6HttpSM9kill_thisEv+0x928)[0x52b548]
 /home/y/bin/traffic_server(_ZN6HttpSM12update_statsEv+0x630)[0x526c50]
 /home/y/bin/traffic_server(_ZN6HttpSM9kill_thisEv+0x928)[0x52b548]
 /home/y/bin/traffic_server(_ZN6HttpSM12main_handlerEiPv+0x198)[0x52b868]
 /home/y/bin/traffic_server(_ZN10HttpTunnel12main_handlerEiPv+0xde)[0x56c3ee]
 /home/y/bin/traffic_server[0x6736a1]
 

[jira] [Commented] (TS-1330) Logging related segfault in 3.2.0

2013-07-30 Thread Leif Hedstrom (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-1330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13723768#comment-13723768
 ] 

Leif Hedstrom commented on TS-1330:
---

Yunkai: lets move the discussion here. Can we expect a patch before friday ?

 Logging related segfault in 3.2.0
 -

 Key: TS-1330
 URL: https://issues.apache.org/jira/browse/TS-1330
 Project: Traffic Server
  Issue Type: Bug
  Components: Logging
Affects Versions: 3.2.0
 Environment: ATS 3.2.0 on RHEL 6.2 64-bit
Reporter: David Carlin
Assignee: Bryan Call
Priority: Critical
  Labels: A, crash
 Fix For: 3.5.0


 I observed the following crash once on one of our ATS boxes - possibly 
 related to TS-1240
 Jul  2 13:56:56 l2 traffic_server[25853]: {0x2b0a391e1700} ERROR: 
 [SSL_NetVConnection::ssl_read_from_net] SSL_ERROR_SYSCALL, underlying IO 
 error: Connection reset by peer
 Jul  2 13:59:56 l2 kernel: [ET_NET 1][25855]: segfault at c ip 
 0058e083 sp 2b0a2982b740 error 6
 Jul  2 13:59:56 l2 kernel: [ET_NET 3][25857]: segfault at 84 ip 
 0058e083 sp 2b0a29a31740 error 6 in traffic_server[40+34]
 Jul  2 13:59:56 l2 kernel: in traffic_server[40+34]
 Jul  2 14:02:59 l2 traffic_cop[25901]: (test) read timeout [18 ]
 Jul  2 14:02:59 l2 traffic_cop[25901]: server heartbeat failed [1]
 Jul  2 14:03:08 l2 traffic_manager[25826]: {0x7f3f088607e0} FATAL: 
 [LocalManager::pollMgmtProcessServer] Error in read (errno: 104)
 Jul  2 14:03:09 l2 traffic_manager[25826]: {0x7f3f088607e0} FATAL:  (last 
 system error 104: Connection reset by peer)
 Jul  2 14:03:09 l2 traffic_cop[25901]: cannot find traffic_server [1]
 Jul  2 14:03:09 l2 traffic_manager[25826]: {0x7f3f088607e0} ERROR: 
 [LocalManager::sendMgmtMsgToProcesses] Error writing message
 Jul  2 14:03:09 l2 traffic_manager[25826]: {0x7f3f088607e0} ERROR:  (last 
 system error 32: Broken pipe)
 Jul  2 14:03:12 l2 traffic_cop[25901]: cop received child status signal 
 [25826 35584]
 Jul  2 14:03:12 l2 traffic_cop[25901]: traffic_manager not running, making 
 sure traffic_server is dead
 Jul  2 14:03:12 l2 traffic_cop[25901]: spawning traffic_manager
 Jul  2 14:03:13 l2 traffic_manager[18267]: NOTE: --- Manager Starting ---
 Jul  2 14:03:13 l2 traffic_manager[18267]: NOTE: Manager Version: Apache 
 Traffic Server - traffic_manager - 3.2.0 - (build # 52518 on Jun 25 2012 at 
 18:22:12)
 Jul  2 14:03:13 l2 traffic_manager[18267]: {0x7fe63de3f7e0} STATUS: opened 
 /home/y/logs/trafficserver/manager.log
 Jul  2 14:03:15 l2 traffic_server[18322]: NOTE: --- Server Starting ---
 Jul  2 14:03:15 l2 traffic_server[18322]: NOTE: Server Version: Apache 
 Traffic Server - traffic_server - 3.2.0 - (build # 52518 on Jun 25 2012 at 
 18:22:31)
 Jul  2 14:03:15 l2 traffic_server[18322]: {0x2b77573ab860} STATUS: opened 
 /home/y/logs/trafficserver/diags.log
 Jul  2 14:03:15 l2 traffic_server[18322]: {0x2b77573ab860} ERROR: Cannot 
 insert duplicate!
 Jul  2 14:03:22 l2 traffic_cop[25901]: server heartbeat succeeded
 [Jul  2 13:56:56.304] Server {0x2b0a391e1700} ERROR: 
 [SSL_NetVConnection::ssl_read_from_net] SSL_ERROR_SYSCALL, underlying IO 
 error: Connection reset by peer
 NOTE: Traffic Server received Sig 11: Segmentation fault
 NOTE: Traffic Server received Sig 11: Segmentation fault
 /home/y/bin/traffic_server - STACK TRACE: 
 /home/y/bin/traffic_server - STACK TRACE: 
 /lib64/libpthread.so.0[0x3b54e0f4a0]
 /lib64/libpthread.so.0[0x3b54e0f4a0]
 /home/y/bin/traffic_server(_ZN9LogBuffer14checkout_writeEPmm+0x153)[0x58e083]
 /home/y/bin/traffic_server(_ZN9LogObject15_checkout_writeEPmm+0xa8)[0x5a64c8]
 /home/y/bin/traffic_server(_ZN9LogObject3logEP9LogAccessPc+0x2f0)[0x5a7e30]
 /home/y/bin/traffic_server(_ZN9LogBuffer14checkout_writeEPmm+0x153)[0x58e083]
 /home/y/bin/traffic_server(_ZN9LogObject15_checkout_writeEPmm+0xa8)[0x5a64c8]
 /home/y/bin/traffic_server(_ZN9LogObject3logEP9LogAccessPc+0x2f0)[0x5a7e30]
 /home/y/bin/traffic_server(_ZN3Log6accessEP9LogAccess+0x146)[0x58f506]
 /home/y/bin/traffic_server(_ZN6HttpSM12update_statsEv+0x630)[0x526c50]
 /home/y/bin/traffic_server(_ZN3Log6accessEP9LogAccess+0x146)[0x58f506]
 /home/y/bin/traffic_server(_ZN6HttpSM9kill_thisEv+0x928)[0x52b548]
 /home/y/bin/traffic_server(_ZN6HttpSM12update_statsEv+0x630)[0x526c50]
 /home/y/bin/traffic_server(_ZN6HttpSM9kill_thisEv+0x928)[0x52b548]
 /home/y/bin/traffic_server(_ZN6HttpSM12main_handlerEiPv+0x198)[0x52b868]
 /home/y/bin/traffic_server(_ZN10HttpTunnel12main_handlerEiPv+0xde)[0x56c3ee]
 /home/y/bin/traffic_server[0x6736a1]
 /home/y/bin/traffic_server(_ZN6HttpSM12main_handlerEiPv+0x198)[0x52b868]
 /home/y/bin/traffic_server(_Z15write_to_net_ioP10NetHandlerP18UnixNetVConnectionP7EThread+0x847)[0x675517]
 

[jira] [Commented] (TS-1330) Logging related segfault in 3.2.0

2013-07-30 Thread Yunkai Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-1330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13723775#comment-13723775
 ] 

Yunkai Zhang commented on TS-1330:
--

[~zwoop]

Yes, maybe tonight:)

 Logging related segfault in 3.2.0
 -

 Key: TS-1330
 URL: https://issues.apache.org/jira/browse/TS-1330
 Project: Traffic Server
  Issue Type: Bug
  Components: Logging
Affects Versions: 3.2.0
 Environment: ATS 3.2.0 on RHEL 6.2 64-bit
Reporter: David Carlin
Assignee: Bryan Call
Priority: Critical
  Labels: A, crash
 Fix For: 3.5.0


 I observed the following crash once on one of our ATS boxes - possibly 
 related to TS-1240
 Jul  2 13:56:56 l2 traffic_server[25853]: {0x2b0a391e1700} ERROR: 
 [SSL_NetVConnection::ssl_read_from_net] SSL_ERROR_SYSCALL, underlying IO 
 error: Connection reset by peer
 Jul  2 13:59:56 l2 kernel: [ET_NET 1][25855]: segfault at c ip 
 0058e083 sp 2b0a2982b740 error 6
 Jul  2 13:59:56 l2 kernel: [ET_NET 3][25857]: segfault at 84 ip 
 0058e083 sp 2b0a29a31740 error 6 in traffic_server[40+34]
 Jul  2 13:59:56 l2 kernel: in traffic_server[40+34]
 Jul  2 14:02:59 l2 traffic_cop[25901]: (test) read timeout [18 ]
 Jul  2 14:02:59 l2 traffic_cop[25901]: server heartbeat failed [1]
 Jul  2 14:03:08 l2 traffic_manager[25826]: {0x7f3f088607e0} FATAL: 
 [LocalManager::pollMgmtProcessServer] Error in read (errno: 104)
 Jul  2 14:03:09 l2 traffic_manager[25826]: {0x7f3f088607e0} FATAL:  (last 
 system error 104: Connection reset by peer)
 Jul  2 14:03:09 l2 traffic_cop[25901]: cannot find traffic_server [1]
 Jul  2 14:03:09 l2 traffic_manager[25826]: {0x7f3f088607e0} ERROR: 
 [LocalManager::sendMgmtMsgToProcesses] Error writing message
 Jul  2 14:03:09 l2 traffic_manager[25826]: {0x7f3f088607e0} ERROR:  (last 
 system error 32: Broken pipe)
 Jul  2 14:03:12 l2 traffic_cop[25901]: cop received child status signal 
 [25826 35584]
 Jul  2 14:03:12 l2 traffic_cop[25901]: traffic_manager not running, making 
 sure traffic_server is dead
 Jul  2 14:03:12 l2 traffic_cop[25901]: spawning traffic_manager
 Jul  2 14:03:13 l2 traffic_manager[18267]: NOTE: --- Manager Starting ---
 Jul  2 14:03:13 l2 traffic_manager[18267]: NOTE: Manager Version: Apache 
 Traffic Server - traffic_manager - 3.2.0 - (build # 52518 on Jun 25 2012 at 
 18:22:12)
 Jul  2 14:03:13 l2 traffic_manager[18267]: {0x7fe63de3f7e0} STATUS: opened 
 /home/y/logs/trafficserver/manager.log
 Jul  2 14:03:15 l2 traffic_server[18322]: NOTE: --- Server Starting ---
 Jul  2 14:03:15 l2 traffic_server[18322]: NOTE: Server Version: Apache 
 Traffic Server - traffic_server - 3.2.0 - (build # 52518 on Jun 25 2012 at 
 18:22:31)
 Jul  2 14:03:15 l2 traffic_server[18322]: {0x2b77573ab860} STATUS: opened 
 /home/y/logs/trafficserver/diags.log
 Jul  2 14:03:15 l2 traffic_server[18322]: {0x2b77573ab860} ERROR: Cannot 
 insert duplicate!
 Jul  2 14:03:22 l2 traffic_cop[25901]: server heartbeat succeeded
 [Jul  2 13:56:56.304] Server {0x2b0a391e1700} ERROR: 
 [SSL_NetVConnection::ssl_read_from_net] SSL_ERROR_SYSCALL, underlying IO 
 error: Connection reset by peer
 NOTE: Traffic Server received Sig 11: Segmentation fault
 NOTE: Traffic Server received Sig 11: Segmentation fault
 /home/y/bin/traffic_server - STACK TRACE: 
 /home/y/bin/traffic_server - STACK TRACE: 
 /lib64/libpthread.so.0[0x3b54e0f4a0]
 /lib64/libpthread.so.0[0x3b54e0f4a0]
 /home/y/bin/traffic_server(_ZN9LogBuffer14checkout_writeEPmm+0x153)[0x58e083]
 /home/y/bin/traffic_server(_ZN9LogObject15_checkout_writeEPmm+0xa8)[0x5a64c8]
 /home/y/bin/traffic_server(_ZN9LogObject3logEP9LogAccessPc+0x2f0)[0x5a7e30]
 /home/y/bin/traffic_server(_ZN9LogBuffer14checkout_writeEPmm+0x153)[0x58e083]
 /home/y/bin/traffic_server(_ZN9LogObject15_checkout_writeEPmm+0xa8)[0x5a64c8]
 /home/y/bin/traffic_server(_ZN9LogObject3logEP9LogAccessPc+0x2f0)[0x5a7e30]
 /home/y/bin/traffic_server(_ZN3Log6accessEP9LogAccess+0x146)[0x58f506]
 /home/y/bin/traffic_server(_ZN6HttpSM12update_statsEv+0x630)[0x526c50]
 /home/y/bin/traffic_server(_ZN3Log6accessEP9LogAccess+0x146)[0x58f506]
 /home/y/bin/traffic_server(_ZN6HttpSM9kill_thisEv+0x928)[0x52b548]
 /home/y/bin/traffic_server(_ZN6HttpSM12update_statsEv+0x630)[0x526c50]
 /home/y/bin/traffic_server(_ZN6HttpSM9kill_thisEv+0x928)[0x52b548]
 /home/y/bin/traffic_server(_ZN6HttpSM12main_handlerEiPv+0x198)[0x52b868]
 /home/y/bin/traffic_server(_ZN10HttpTunnel12main_handlerEiPv+0xde)[0x56c3ee]
 /home/y/bin/traffic_server[0x6736a1]
 /home/y/bin/traffic_server(_ZN6HttpSM12main_handlerEiPv+0x198)[0x52b868]
 /home/y/bin/traffic_server(_Z15write_to_net_ioP10NetHandlerP18UnixNetVConnectionP7EThread+0x847)[0x675517]
 /home/y/bin/traffic_server[0x672f81]
 

[jira] [Commented] (TS-1330) Logging related segfault in 3.2.0

2013-07-30 Thread Yunkai Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-1330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13723962#comment-13723962
 ] 

Yunkai Zhang commented on TS-1330:
--

I have found the root cause of this issue, here is my analysis:

1. One thread(A) got an LogBuffer object in LogObject::_checkout_write() and 
ran into LogBuffer::checkout_write(), before it updated the LogBuffer::m_state, 
it was scheduled out by the kernel.

2. Other threads got the same LogBuffer object, but they *all* finished 
_checkout_write() codes and made the LogBuffer object become *full*, finally 
one of these thread flushed the LogBuffer object into write_list queue.

3. The LogBuffer object's m_references == 0 at this time.

4. The flushing thread would call LogBufferManager::flush_buffers() to loop 
write_list. As the LogBuffer's m_references == 0, so it would be flushed and 
*deleted*.

5. Now, thread(A) would be scheduled in by the kernel, but the LogBuffer object 
has been *deleted*, so the m_buffer == NULL, that is why entry_header is 0x0.

Here is a coredump generated by the my testing patch - 
0001-A-testing-patch-to-verify-my-analysis.patch:
{code}
(gdb) bt
#0  0x003c75232885 in raise () from /lib64/libc.so.6
#1  0x003c75234065 in abort () from /lib64/libc.so.6
#2  0x2b02d24bdaf8 in ink_die_die_die (retval=1) at ink_error.cc:43
#3  0x2b02d24bdbca in ink_fatal_va(int, const char *, typedef __va_list_tag 
__va_list_tag *) (return_code=1,
message_format=0x2b02d24dbe40 %s:%d: failed assert `%s`, 
ap=0x2b0304211a90) at ink_error.cc:65
#4  0x2b02d24bdc95 in ink_fatal (return_code=1, 
message_format=0x2b02d24dbe40 %s:%d: failed assert `%s`)
at ink_error.cc:73
#5  0x2b02d24bcbc4 in _ink_assert (expression=0x74e021 current_ref == 0, 
file=0x74e014 LogObject.cc, line=60)
at ink_assert.cc:38
#6  0x0060a9ab in LogBufferManager::flush_buffers (this=0x3219a08, 
sink=0x32199a0) at LogObject.cc:60
#7  0x005f755e in LogObject::flush_buffers (this=0x3219970) at 
LogObject.h:126
#8  0x0060de55 in LogObjectManager::flush_buffers (this=0x31e84f0) at 
LogObject.cc:1185
#9  0x005f6627 in Log::flush_thread_main (args=0x0) at Log.cc:1193
#10 0x005f779b in LoggingFlushContinuation::mainEvent (this=0x31e8fb0, 
event=1, data=0x2b02d8e13c60) at Log.cc:297
#11 0x004f1f1e in Continuation::handleEvent (this=0x31e8fb0, event=1, 
data=0x2b02d8e13c60)
at ../iocore/eventsystem/I_Continuation.h:146
#12 0x0070fb18 in EThread::execute (this=0x2b0304013010) at 
UnixEThread.cc:291
#13 0x0070e6df in spawn_thread_internal (a=0x31e83d0) at Thread.cc:88
#14 0x003c756077f1 in start_thread () from /lib64/libpthread.so.0
#15 0x003c752e570d in clone () from /lib64/libc.so.6
(gdb) f 6
#6  0x0060a9ab in LogBufferManager::flush_buffers (this=0x3219a08, 
sink=0x32199a0) at LogObject.cc:60
60ink_release_assert(current_ref == 0);
(gdb) l
55 * m_ref_debug could be changed by other threads
56 * currently, let's store it into local variable
57 * which is helpful for debugging.
58 */
59int current_ref = b-m_ref_debug;
60ink_release_assert(current_ref == 0);
61new_q.push(b);
62  }
63}
64
(gdb) p current_ref
$2 = 2
(gdb) p b-m_references 
$3 = 0
(gdb) 
{code}

Once we found out the root cause, fix is simple. We should refine the 
LogBuffer::m_references variaable, so that it can *really* prevent outstanding 
references.

 Logging related segfault in 3.2.0
 -

 Key: TS-1330
 URL: https://issues.apache.org/jira/browse/TS-1330
 Project: Traffic Server
  Issue Type: Bug
  Components: Logging
Affects Versions: 3.2.0
 Environment: ATS 3.2.0 on RHEL 6.2 64-bit
Reporter: David Carlin
Assignee: Bryan Call
Priority: Critical
  Labels: A, crash
 Fix For: 3.5.0


 I observed the following crash once on one of our ATS boxes - possibly 
 related to TS-1240
 Jul  2 13:56:56 l2 traffic_server[25853]: {0x2b0a391e1700} ERROR: 
 [SSL_NetVConnection::ssl_read_from_net] SSL_ERROR_SYSCALL, underlying IO 
 error: Connection reset by peer
 Jul  2 13:59:56 l2 kernel: [ET_NET 1][25855]: segfault at c ip 
 0058e083 sp 2b0a2982b740 error 6
 Jul  2 13:59:56 l2 kernel: [ET_NET 3][25857]: segfault at 84 ip 
 0058e083 sp 2b0a29a31740 error 6 in traffic_server[40+34]
 Jul  2 13:59:56 l2 kernel: in traffic_server[40+34]
 Jul  2 14:02:59 l2 traffic_cop[25901]: (test) read timeout [18 ]
 Jul  2 14:02:59 l2 traffic_cop[25901]: server heartbeat failed [1]
 Jul  2 14:03:08 l2 traffic_manager[25826]: {0x7f3f088607e0} FATAL: 
 [LocalManager::pollMgmtProcessServer] Error in read (errno: 104)
 Jul  2 

[jira] [Commented] (TS-1330) Logging related segfault in 3.2.0

2013-07-30 Thread Yunkai Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-1330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13724132#comment-13724132
 ] 

Yunkai Zhang commented on TS-1330:
--

I have submitted a patch to fix this issue:

https://issues.apache.org/jira/secure/attachment/12594993/0001-TS-1330-Fix-logging-core-at-checkout_write.patch

Please test it, any feedback is welcome.

 Logging related segfault in 3.2.0
 -

 Key: TS-1330
 URL: https://issues.apache.org/jira/browse/TS-1330
 Project: Traffic Server
  Issue Type: Bug
  Components: Logging
Affects Versions: 3.2.0
 Environment: ATS 3.2.0 on RHEL 6.2 64-bit
Reporter: David Carlin
Assignee: Bryan Call
Priority: Critical
  Labels: A, crash
 Fix For: 3.5.0

 Attachments: 0001-A-testing-patch-to-verify-my-analysis.patch, 
 0001-TS-1330-Fix-logging-core-at-checkout_write.patch


 I observed the following crash once on one of our ATS boxes - possibly 
 related to TS-1240
 Jul  2 13:56:56 l2 traffic_server[25853]: {0x2b0a391e1700} ERROR: 
 [SSL_NetVConnection::ssl_read_from_net] SSL_ERROR_SYSCALL, underlying IO 
 error: Connection reset by peer
 Jul  2 13:59:56 l2 kernel: [ET_NET 1][25855]: segfault at c ip 
 0058e083 sp 2b0a2982b740 error 6
 Jul  2 13:59:56 l2 kernel: [ET_NET 3][25857]: segfault at 84 ip 
 0058e083 sp 2b0a29a31740 error 6 in traffic_server[40+34]
 Jul  2 13:59:56 l2 kernel: in traffic_server[40+34]
 Jul  2 14:02:59 l2 traffic_cop[25901]: (test) read timeout [18 ]
 Jul  2 14:02:59 l2 traffic_cop[25901]: server heartbeat failed [1]
 Jul  2 14:03:08 l2 traffic_manager[25826]: {0x7f3f088607e0} FATAL: 
 [LocalManager::pollMgmtProcessServer] Error in read (errno: 104)
 Jul  2 14:03:09 l2 traffic_manager[25826]: {0x7f3f088607e0} FATAL:  (last 
 system error 104: Connection reset by peer)
 Jul  2 14:03:09 l2 traffic_cop[25901]: cannot find traffic_server [1]
 Jul  2 14:03:09 l2 traffic_manager[25826]: {0x7f3f088607e0} ERROR: 
 [LocalManager::sendMgmtMsgToProcesses] Error writing message
 Jul  2 14:03:09 l2 traffic_manager[25826]: {0x7f3f088607e0} ERROR:  (last 
 system error 32: Broken pipe)
 Jul  2 14:03:12 l2 traffic_cop[25901]: cop received child status signal 
 [25826 35584]
 Jul  2 14:03:12 l2 traffic_cop[25901]: traffic_manager not running, making 
 sure traffic_server is dead
 Jul  2 14:03:12 l2 traffic_cop[25901]: spawning traffic_manager
 Jul  2 14:03:13 l2 traffic_manager[18267]: NOTE: --- Manager Starting ---
 Jul  2 14:03:13 l2 traffic_manager[18267]: NOTE: Manager Version: Apache 
 Traffic Server - traffic_manager - 3.2.0 - (build # 52518 on Jun 25 2012 at 
 18:22:12)
 Jul  2 14:03:13 l2 traffic_manager[18267]: {0x7fe63de3f7e0} STATUS: opened 
 /home/y/logs/trafficserver/manager.log
 Jul  2 14:03:15 l2 traffic_server[18322]: NOTE: --- Server Starting ---
 Jul  2 14:03:15 l2 traffic_server[18322]: NOTE: Server Version: Apache 
 Traffic Server - traffic_server - 3.2.0 - (build # 52518 on Jun 25 2012 at 
 18:22:31)
 Jul  2 14:03:15 l2 traffic_server[18322]: {0x2b77573ab860} STATUS: opened 
 /home/y/logs/trafficserver/diags.log
 Jul  2 14:03:15 l2 traffic_server[18322]: {0x2b77573ab860} ERROR: Cannot 
 insert duplicate!
 Jul  2 14:03:22 l2 traffic_cop[25901]: server heartbeat succeeded
 [Jul  2 13:56:56.304] Server {0x2b0a391e1700} ERROR: 
 [SSL_NetVConnection::ssl_read_from_net] SSL_ERROR_SYSCALL, underlying IO 
 error: Connection reset by peer
 NOTE: Traffic Server received Sig 11: Segmentation fault
 NOTE: Traffic Server received Sig 11: Segmentation fault
 /home/y/bin/traffic_server - STACK TRACE: 
 /home/y/bin/traffic_server - STACK TRACE: 
 /lib64/libpthread.so.0[0x3b54e0f4a0]
 /lib64/libpthread.so.0[0x3b54e0f4a0]
 /home/y/bin/traffic_server(_ZN9LogBuffer14checkout_writeEPmm+0x153)[0x58e083]
 /home/y/bin/traffic_server(_ZN9LogObject15_checkout_writeEPmm+0xa8)[0x5a64c8]
 /home/y/bin/traffic_server(_ZN9LogObject3logEP9LogAccessPc+0x2f0)[0x5a7e30]
 /home/y/bin/traffic_server(_ZN9LogBuffer14checkout_writeEPmm+0x153)[0x58e083]
 /home/y/bin/traffic_server(_ZN9LogObject15_checkout_writeEPmm+0xa8)[0x5a64c8]
 /home/y/bin/traffic_server(_ZN9LogObject3logEP9LogAccessPc+0x2f0)[0x5a7e30]
 /home/y/bin/traffic_server(_ZN3Log6accessEP9LogAccess+0x146)[0x58f506]
 /home/y/bin/traffic_server(_ZN6HttpSM12update_statsEv+0x630)[0x526c50]
 /home/y/bin/traffic_server(_ZN3Log6accessEP9LogAccess+0x146)[0x58f506]
 /home/y/bin/traffic_server(_ZN6HttpSM9kill_thisEv+0x928)[0x52b548]
 /home/y/bin/traffic_server(_ZN6HttpSM12update_statsEv+0x630)[0x526c50]
 /home/y/bin/traffic_server(_ZN6HttpSM9kill_thisEv+0x928)[0x52b548]
 /home/y/bin/traffic_server(_ZN6HttpSM12main_handlerEiPv+0x198)[0x52b868]
 /home/y/bin/traffic_server(_ZN10HttpTunnel12main_handlerEiPv+0xde)[0x56c3ee]
 

[jira] [Commented] (TS-1330) Logging related segfault in 3.2.0

2013-07-30 Thread Yunkai Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-1330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13724730#comment-13724730
 ] 

Yunkai Zhang commented on TS-1330:
--

I'm cooking V2 of this patch as I found other problems about logging.

Please wait ...

 Logging related segfault in 3.2.0
 -

 Key: TS-1330
 URL: https://issues.apache.org/jira/browse/TS-1330
 Project: Traffic Server
  Issue Type: Bug
  Components: Logging
Affects Versions: 3.2.0
 Environment: ATS 3.2.0 on RHEL 6.2 64-bit
Reporter: David Carlin
Assignee: Bryan Call
Priority: Critical
  Labels: A, crash
 Fix For: 3.5.0

 Attachments: 0001-A-testing-patch-to-verify-my-analysis.patch, 
 0001-TS-1330-Fix-logging-core-at-checkout_write.patch


 I observed the following crash once on one of our ATS boxes - possibly 
 related to TS-1240
 Jul  2 13:56:56 l2 traffic_server[25853]: {0x2b0a391e1700} ERROR: 
 [SSL_NetVConnection::ssl_read_from_net] SSL_ERROR_SYSCALL, underlying IO 
 error: Connection reset by peer
 Jul  2 13:59:56 l2 kernel: [ET_NET 1][25855]: segfault at c ip 
 0058e083 sp 2b0a2982b740 error 6
 Jul  2 13:59:56 l2 kernel: [ET_NET 3][25857]: segfault at 84 ip 
 0058e083 sp 2b0a29a31740 error 6 in traffic_server[40+34]
 Jul  2 13:59:56 l2 kernel: in traffic_server[40+34]
 Jul  2 14:02:59 l2 traffic_cop[25901]: (test) read timeout [18 ]
 Jul  2 14:02:59 l2 traffic_cop[25901]: server heartbeat failed [1]
 Jul  2 14:03:08 l2 traffic_manager[25826]: {0x7f3f088607e0} FATAL: 
 [LocalManager::pollMgmtProcessServer] Error in read (errno: 104)
 Jul  2 14:03:09 l2 traffic_manager[25826]: {0x7f3f088607e0} FATAL:  (last 
 system error 104: Connection reset by peer)
 Jul  2 14:03:09 l2 traffic_cop[25901]: cannot find traffic_server [1]
 Jul  2 14:03:09 l2 traffic_manager[25826]: {0x7f3f088607e0} ERROR: 
 [LocalManager::sendMgmtMsgToProcesses] Error writing message
 Jul  2 14:03:09 l2 traffic_manager[25826]: {0x7f3f088607e0} ERROR:  (last 
 system error 32: Broken pipe)
 Jul  2 14:03:12 l2 traffic_cop[25901]: cop received child status signal 
 [25826 35584]
 Jul  2 14:03:12 l2 traffic_cop[25901]: traffic_manager not running, making 
 sure traffic_server is dead
 Jul  2 14:03:12 l2 traffic_cop[25901]: spawning traffic_manager
 Jul  2 14:03:13 l2 traffic_manager[18267]: NOTE: --- Manager Starting ---
 Jul  2 14:03:13 l2 traffic_manager[18267]: NOTE: Manager Version: Apache 
 Traffic Server - traffic_manager - 3.2.0 - (build # 52518 on Jun 25 2012 at 
 18:22:12)
 Jul  2 14:03:13 l2 traffic_manager[18267]: {0x7fe63de3f7e0} STATUS: opened 
 /home/y/logs/trafficserver/manager.log
 Jul  2 14:03:15 l2 traffic_server[18322]: NOTE: --- Server Starting ---
 Jul  2 14:03:15 l2 traffic_server[18322]: NOTE: Server Version: Apache 
 Traffic Server - traffic_server - 3.2.0 - (build # 52518 on Jun 25 2012 at 
 18:22:31)
 Jul  2 14:03:15 l2 traffic_server[18322]: {0x2b77573ab860} STATUS: opened 
 /home/y/logs/trafficserver/diags.log
 Jul  2 14:03:15 l2 traffic_server[18322]: {0x2b77573ab860} ERROR: Cannot 
 insert duplicate!
 Jul  2 14:03:22 l2 traffic_cop[25901]: server heartbeat succeeded
 [Jul  2 13:56:56.304] Server {0x2b0a391e1700} ERROR: 
 [SSL_NetVConnection::ssl_read_from_net] SSL_ERROR_SYSCALL, underlying IO 
 error: Connection reset by peer
 NOTE: Traffic Server received Sig 11: Segmentation fault
 NOTE: Traffic Server received Sig 11: Segmentation fault
 /home/y/bin/traffic_server - STACK TRACE: 
 /home/y/bin/traffic_server - STACK TRACE: 
 /lib64/libpthread.so.0[0x3b54e0f4a0]
 /lib64/libpthread.so.0[0x3b54e0f4a0]
 /home/y/bin/traffic_server(_ZN9LogBuffer14checkout_writeEPmm+0x153)[0x58e083]
 /home/y/bin/traffic_server(_ZN9LogObject15_checkout_writeEPmm+0xa8)[0x5a64c8]
 /home/y/bin/traffic_server(_ZN9LogObject3logEP9LogAccessPc+0x2f0)[0x5a7e30]
 /home/y/bin/traffic_server(_ZN9LogBuffer14checkout_writeEPmm+0x153)[0x58e083]
 /home/y/bin/traffic_server(_ZN9LogObject15_checkout_writeEPmm+0xa8)[0x5a64c8]
 /home/y/bin/traffic_server(_ZN9LogObject3logEP9LogAccessPc+0x2f0)[0x5a7e30]
 /home/y/bin/traffic_server(_ZN3Log6accessEP9LogAccess+0x146)[0x58f506]
 /home/y/bin/traffic_server(_ZN6HttpSM12update_statsEv+0x630)[0x526c50]
 /home/y/bin/traffic_server(_ZN3Log6accessEP9LogAccess+0x146)[0x58f506]
 /home/y/bin/traffic_server(_ZN6HttpSM9kill_thisEv+0x928)[0x52b548]
 /home/y/bin/traffic_server(_ZN6HttpSM12update_statsEv+0x630)[0x526c50]
 /home/y/bin/traffic_server(_ZN6HttpSM9kill_thisEv+0x928)[0x52b548]
 /home/y/bin/traffic_server(_ZN6HttpSM12main_handlerEiPv+0x198)[0x52b868]
 /home/y/bin/traffic_server(_ZN10HttpTunnel12main_handlerEiPv+0xde)[0x56c3ee]
 /home/y/bin/traffic_server[0x6736a1]
 /home/y/bin/traffic_server(_ZN6HttpSM12main_handlerEiPv+0x198)[0x52b868]
 

[jira] [Commented] (TS-1330) Logging related segfault in 3.2.0

2013-07-30 Thread James Peach (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-1330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13724770#comment-13724770
 ] 

James Peach commented on TS-1330:
-

Is m_references enforcing the lifetime of the LogBuffer object? If so, can we 
use Ptr for that?

 Logging related segfault in 3.2.0
 -

 Key: TS-1330
 URL: https://issues.apache.org/jira/browse/TS-1330
 Project: Traffic Server
  Issue Type: Bug
  Components: Logging
Affects Versions: 3.2.0
 Environment: ATS 3.2.0 on RHEL 6.2 64-bit
Reporter: David Carlin
Assignee: Bryan Call
Priority: Critical
  Labels: A, crash
 Fix For: 3.5.0

 Attachments: 0001-A-testing-patch-to-verify-my-analysis.patch, 
 0001-TS-1330-Fix-logging-core-at-checkout_write.patch


 I observed the following crash once on one of our ATS boxes - possibly 
 related to TS-1240
 Jul  2 13:56:56 l2 traffic_server[25853]: {0x2b0a391e1700} ERROR: 
 [SSL_NetVConnection::ssl_read_from_net] SSL_ERROR_SYSCALL, underlying IO 
 error: Connection reset by peer
 Jul  2 13:59:56 l2 kernel: [ET_NET 1][25855]: segfault at c ip 
 0058e083 sp 2b0a2982b740 error 6
 Jul  2 13:59:56 l2 kernel: [ET_NET 3][25857]: segfault at 84 ip 
 0058e083 sp 2b0a29a31740 error 6 in traffic_server[40+34]
 Jul  2 13:59:56 l2 kernel: in traffic_server[40+34]
 Jul  2 14:02:59 l2 traffic_cop[25901]: (test) read timeout [18 ]
 Jul  2 14:02:59 l2 traffic_cop[25901]: server heartbeat failed [1]
 Jul  2 14:03:08 l2 traffic_manager[25826]: {0x7f3f088607e0} FATAL: 
 [LocalManager::pollMgmtProcessServer] Error in read (errno: 104)
 Jul  2 14:03:09 l2 traffic_manager[25826]: {0x7f3f088607e0} FATAL:  (last 
 system error 104: Connection reset by peer)
 Jul  2 14:03:09 l2 traffic_cop[25901]: cannot find traffic_server [1]
 Jul  2 14:03:09 l2 traffic_manager[25826]: {0x7f3f088607e0} ERROR: 
 [LocalManager::sendMgmtMsgToProcesses] Error writing message
 Jul  2 14:03:09 l2 traffic_manager[25826]: {0x7f3f088607e0} ERROR:  (last 
 system error 32: Broken pipe)
 Jul  2 14:03:12 l2 traffic_cop[25901]: cop received child status signal 
 [25826 35584]
 Jul  2 14:03:12 l2 traffic_cop[25901]: traffic_manager not running, making 
 sure traffic_server is dead
 Jul  2 14:03:12 l2 traffic_cop[25901]: spawning traffic_manager
 Jul  2 14:03:13 l2 traffic_manager[18267]: NOTE: --- Manager Starting ---
 Jul  2 14:03:13 l2 traffic_manager[18267]: NOTE: Manager Version: Apache 
 Traffic Server - traffic_manager - 3.2.0 - (build # 52518 on Jun 25 2012 at 
 18:22:12)
 Jul  2 14:03:13 l2 traffic_manager[18267]: {0x7fe63de3f7e0} STATUS: opened 
 /home/y/logs/trafficserver/manager.log
 Jul  2 14:03:15 l2 traffic_server[18322]: NOTE: --- Server Starting ---
 Jul  2 14:03:15 l2 traffic_server[18322]: NOTE: Server Version: Apache 
 Traffic Server - traffic_server - 3.2.0 - (build # 52518 on Jun 25 2012 at 
 18:22:31)
 Jul  2 14:03:15 l2 traffic_server[18322]: {0x2b77573ab860} STATUS: opened 
 /home/y/logs/trafficserver/diags.log
 Jul  2 14:03:15 l2 traffic_server[18322]: {0x2b77573ab860} ERROR: Cannot 
 insert duplicate!
 Jul  2 14:03:22 l2 traffic_cop[25901]: server heartbeat succeeded
 [Jul  2 13:56:56.304] Server {0x2b0a391e1700} ERROR: 
 [SSL_NetVConnection::ssl_read_from_net] SSL_ERROR_SYSCALL, underlying IO 
 error: Connection reset by peer
 NOTE: Traffic Server received Sig 11: Segmentation fault
 NOTE: Traffic Server received Sig 11: Segmentation fault
 /home/y/bin/traffic_server - STACK TRACE: 
 /home/y/bin/traffic_server - STACK TRACE: 
 /lib64/libpthread.so.0[0x3b54e0f4a0]
 /lib64/libpthread.so.0[0x3b54e0f4a0]
 /home/y/bin/traffic_server(_ZN9LogBuffer14checkout_writeEPmm+0x153)[0x58e083]
 /home/y/bin/traffic_server(_ZN9LogObject15_checkout_writeEPmm+0xa8)[0x5a64c8]
 /home/y/bin/traffic_server(_ZN9LogObject3logEP9LogAccessPc+0x2f0)[0x5a7e30]
 /home/y/bin/traffic_server(_ZN9LogBuffer14checkout_writeEPmm+0x153)[0x58e083]
 /home/y/bin/traffic_server(_ZN9LogObject15_checkout_writeEPmm+0xa8)[0x5a64c8]
 /home/y/bin/traffic_server(_ZN9LogObject3logEP9LogAccessPc+0x2f0)[0x5a7e30]
 /home/y/bin/traffic_server(_ZN3Log6accessEP9LogAccess+0x146)[0x58f506]
 /home/y/bin/traffic_server(_ZN6HttpSM12update_statsEv+0x630)[0x526c50]
 /home/y/bin/traffic_server(_ZN3Log6accessEP9LogAccess+0x146)[0x58f506]
 /home/y/bin/traffic_server(_ZN6HttpSM9kill_thisEv+0x928)[0x52b548]
 /home/y/bin/traffic_server(_ZN6HttpSM12update_statsEv+0x630)[0x526c50]
 /home/y/bin/traffic_server(_ZN6HttpSM9kill_thisEv+0x928)[0x52b548]
 /home/y/bin/traffic_server(_ZN6HttpSM12main_handlerEiPv+0x198)[0x52b868]
 /home/y/bin/traffic_server(_ZN10HttpTunnel12main_handlerEiPv+0xde)[0x56c3ee]
 /home/y/bin/traffic_server[0x6736a1]
 /home/y/bin/traffic_server(_ZN6HttpSM12main_handlerEiPv+0x198)[0x52b868]

[jira] [Commented] (TS-1330) Logging related segfault in 3.2.0

2013-07-30 Thread Yunkai Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-1330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13724782#comment-13724782
 ] 

Yunkai Zhang commented on TS-1330:
--

No, m_references only enforces the lifetime of getting offset from LogBuffer, 
that is what checkout_write() do.

 Logging related segfault in 3.2.0
 -

 Key: TS-1330
 URL: https://issues.apache.org/jira/browse/TS-1330
 Project: Traffic Server
  Issue Type: Bug
  Components: Logging
Affects Versions: 3.2.0
 Environment: ATS 3.2.0 on RHEL 6.2 64-bit
Reporter: David Carlin
Assignee: Bryan Call
Priority: Critical
  Labels: A, crash
 Fix For: 3.5.0

 Attachments: 0001-A-testing-patch-to-verify-my-analysis.patch, 
 0001-TS-1330-Fix-logging-core-at-checkout_write.patch


 I observed the following crash once on one of our ATS boxes - possibly 
 related to TS-1240
 Jul  2 13:56:56 l2 traffic_server[25853]: {0x2b0a391e1700} ERROR: 
 [SSL_NetVConnection::ssl_read_from_net] SSL_ERROR_SYSCALL, underlying IO 
 error: Connection reset by peer
 Jul  2 13:59:56 l2 kernel: [ET_NET 1][25855]: segfault at c ip 
 0058e083 sp 2b0a2982b740 error 6
 Jul  2 13:59:56 l2 kernel: [ET_NET 3][25857]: segfault at 84 ip 
 0058e083 sp 2b0a29a31740 error 6 in traffic_server[40+34]
 Jul  2 13:59:56 l2 kernel: in traffic_server[40+34]
 Jul  2 14:02:59 l2 traffic_cop[25901]: (test) read timeout [18 ]
 Jul  2 14:02:59 l2 traffic_cop[25901]: server heartbeat failed [1]
 Jul  2 14:03:08 l2 traffic_manager[25826]: {0x7f3f088607e0} FATAL: 
 [LocalManager::pollMgmtProcessServer] Error in read (errno: 104)
 Jul  2 14:03:09 l2 traffic_manager[25826]: {0x7f3f088607e0} FATAL:  (last 
 system error 104: Connection reset by peer)
 Jul  2 14:03:09 l2 traffic_cop[25901]: cannot find traffic_server [1]
 Jul  2 14:03:09 l2 traffic_manager[25826]: {0x7f3f088607e0} ERROR: 
 [LocalManager::sendMgmtMsgToProcesses] Error writing message
 Jul  2 14:03:09 l2 traffic_manager[25826]: {0x7f3f088607e0} ERROR:  (last 
 system error 32: Broken pipe)
 Jul  2 14:03:12 l2 traffic_cop[25901]: cop received child status signal 
 [25826 35584]
 Jul  2 14:03:12 l2 traffic_cop[25901]: traffic_manager not running, making 
 sure traffic_server is dead
 Jul  2 14:03:12 l2 traffic_cop[25901]: spawning traffic_manager
 Jul  2 14:03:13 l2 traffic_manager[18267]: NOTE: --- Manager Starting ---
 Jul  2 14:03:13 l2 traffic_manager[18267]: NOTE: Manager Version: Apache 
 Traffic Server - traffic_manager - 3.2.0 - (build # 52518 on Jun 25 2012 at 
 18:22:12)
 Jul  2 14:03:13 l2 traffic_manager[18267]: {0x7fe63de3f7e0} STATUS: opened 
 /home/y/logs/trafficserver/manager.log
 Jul  2 14:03:15 l2 traffic_server[18322]: NOTE: --- Server Starting ---
 Jul  2 14:03:15 l2 traffic_server[18322]: NOTE: Server Version: Apache 
 Traffic Server - traffic_server - 3.2.0 - (build # 52518 on Jun 25 2012 at 
 18:22:31)
 Jul  2 14:03:15 l2 traffic_server[18322]: {0x2b77573ab860} STATUS: opened 
 /home/y/logs/trafficserver/diags.log
 Jul  2 14:03:15 l2 traffic_server[18322]: {0x2b77573ab860} ERROR: Cannot 
 insert duplicate!
 Jul  2 14:03:22 l2 traffic_cop[25901]: server heartbeat succeeded
 [Jul  2 13:56:56.304] Server {0x2b0a391e1700} ERROR: 
 [SSL_NetVConnection::ssl_read_from_net] SSL_ERROR_SYSCALL, underlying IO 
 error: Connection reset by peer
 NOTE: Traffic Server received Sig 11: Segmentation fault
 NOTE: Traffic Server received Sig 11: Segmentation fault
 /home/y/bin/traffic_server - STACK TRACE: 
 /home/y/bin/traffic_server - STACK TRACE: 
 /lib64/libpthread.so.0[0x3b54e0f4a0]
 /lib64/libpthread.so.0[0x3b54e0f4a0]
 /home/y/bin/traffic_server(_ZN9LogBuffer14checkout_writeEPmm+0x153)[0x58e083]
 /home/y/bin/traffic_server(_ZN9LogObject15_checkout_writeEPmm+0xa8)[0x5a64c8]
 /home/y/bin/traffic_server(_ZN9LogObject3logEP9LogAccessPc+0x2f0)[0x5a7e30]
 /home/y/bin/traffic_server(_ZN9LogBuffer14checkout_writeEPmm+0x153)[0x58e083]
 /home/y/bin/traffic_server(_ZN9LogObject15_checkout_writeEPmm+0xa8)[0x5a64c8]
 /home/y/bin/traffic_server(_ZN9LogObject3logEP9LogAccessPc+0x2f0)[0x5a7e30]
 /home/y/bin/traffic_server(_ZN3Log6accessEP9LogAccess+0x146)[0x58f506]
 /home/y/bin/traffic_server(_ZN6HttpSM12update_statsEv+0x630)[0x526c50]
 /home/y/bin/traffic_server(_ZN3Log6accessEP9LogAccess+0x146)[0x58f506]
 /home/y/bin/traffic_server(_ZN6HttpSM9kill_thisEv+0x928)[0x52b548]
 /home/y/bin/traffic_server(_ZN6HttpSM12update_statsEv+0x630)[0x526c50]
 /home/y/bin/traffic_server(_ZN6HttpSM9kill_thisEv+0x928)[0x52b548]
 /home/y/bin/traffic_server(_ZN6HttpSM12main_handlerEiPv+0x198)[0x52b868]
 /home/y/bin/traffic_server(_ZN10HttpTunnel12main_handlerEiPv+0xde)[0x56c3ee]
 /home/y/bin/traffic_server[0x6736a1]
 

[jira] [Commented] (TS-1330) Logging related segfault in 3.2.0

2013-07-25 Thread Leif Hedstrom (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-1330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13720380#comment-13720380
 ] 

Leif Hedstrom commented on TS-1330:
---

Yunkai: Can you comment your findings on this ticket as well? Particularly if 
you have any details on how to reproduce it...


 Logging related segfault in 3.2.0
 -

 Key: TS-1330
 URL: https://issues.apache.org/jira/browse/TS-1330
 Project: Traffic Server
  Issue Type: Bug
  Components: Logging
Affects Versions: 3.2.0
 Environment: ATS 3.2.0 on RHEL 6.2 64-bit
Reporter: David Carlin
Assignee: Bryan Call
Priority: Critical
  Labels: A, crash
 Fix For: 3.5.0


 I observed the following crash once on one of our ATS boxes - possibly 
 related to TS-1240
 Jul  2 13:56:56 l2 traffic_server[25853]: {0x2b0a391e1700} ERROR: 
 [SSL_NetVConnection::ssl_read_from_net] SSL_ERROR_SYSCALL, underlying IO 
 error: Connection reset by peer
 Jul  2 13:59:56 l2 kernel: [ET_NET 1][25855]: segfault at c ip 
 0058e083 sp 2b0a2982b740 error 6
 Jul  2 13:59:56 l2 kernel: [ET_NET 3][25857]: segfault at 84 ip 
 0058e083 sp 2b0a29a31740 error 6 in traffic_server[40+34]
 Jul  2 13:59:56 l2 kernel: in traffic_server[40+34]
 Jul  2 14:02:59 l2 traffic_cop[25901]: (test) read timeout [18 ]
 Jul  2 14:02:59 l2 traffic_cop[25901]: server heartbeat failed [1]
 Jul  2 14:03:08 l2 traffic_manager[25826]: {0x7f3f088607e0} FATAL: 
 [LocalManager::pollMgmtProcessServer] Error in read (errno: 104)
 Jul  2 14:03:09 l2 traffic_manager[25826]: {0x7f3f088607e0} FATAL:  (last 
 system error 104: Connection reset by peer)
 Jul  2 14:03:09 l2 traffic_cop[25901]: cannot find traffic_server [1]
 Jul  2 14:03:09 l2 traffic_manager[25826]: {0x7f3f088607e0} ERROR: 
 [LocalManager::sendMgmtMsgToProcesses] Error writing message
 Jul  2 14:03:09 l2 traffic_manager[25826]: {0x7f3f088607e0} ERROR:  (last 
 system error 32: Broken pipe)
 Jul  2 14:03:12 l2 traffic_cop[25901]: cop received child status signal 
 [25826 35584]
 Jul  2 14:03:12 l2 traffic_cop[25901]: traffic_manager not running, making 
 sure traffic_server is dead
 Jul  2 14:03:12 l2 traffic_cop[25901]: spawning traffic_manager
 Jul  2 14:03:13 l2 traffic_manager[18267]: NOTE: --- Manager Starting ---
 Jul  2 14:03:13 l2 traffic_manager[18267]: NOTE: Manager Version: Apache 
 Traffic Server - traffic_manager - 3.2.0 - (build # 52518 on Jun 25 2012 at 
 18:22:12)
 Jul  2 14:03:13 l2 traffic_manager[18267]: {0x7fe63de3f7e0} STATUS: opened 
 /home/y/logs/trafficserver/manager.log
 Jul  2 14:03:15 l2 traffic_server[18322]: NOTE: --- Server Starting ---
 Jul  2 14:03:15 l2 traffic_server[18322]: NOTE: Server Version: Apache 
 Traffic Server - traffic_server - 3.2.0 - (build # 52518 on Jun 25 2012 at 
 18:22:31)
 Jul  2 14:03:15 l2 traffic_server[18322]: {0x2b77573ab860} STATUS: opened 
 /home/y/logs/trafficserver/diags.log
 Jul  2 14:03:15 l2 traffic_server[18322]: {0x2b77573ab860} ERROR: Cannot 
 insert duplicate!
 Jul  2 14:03:22 l2 traffic_cop[25901]: server heartbeat succeeded
 [Jul  2 13:56:56.304] Server {0x2b0a391e1700} ERROR: 
 [SSL_NetVConnection::ssl_read_from_net] SSL_ERROR_SYSCALL, underlying IO 
 error: Connection reset by peer
 NOTE: Traffic Server received Sig 11: Segmentation fault
 NOTE: Traffic Server received Sig 11: Segmentation fault
 /home/y/bin/traffic_server - STACK TRACE: 
 /home/y/bin/traffic_server - STACK TRACE: 
 /lib64/libpthread.so.0[0x3b54e0f4a0]
 /lib64/libpthread.so.0[0x3b54e0f4a0]
 /home/y/bin/traffic_server(_ZN9LogBuffer14checkout_writeEPmm+0x153)[0x58e083]
 /home/y/bin/traffic_server(_ZN9LogObject15_checkout_writeEPmm+0xa8)[0x5a64c8]
 /home/y/bin/traffic_server(_ZN9LogObject3logEP9LogAccessPc+0x2f0)[0x5a7e30]
 /home/y/bin/traffic_server(_ZN9LogBuffer14checkout_writeEPmm+0x153)[0x58e083]
 /home/y/bin/traffic_server(_ZN9LogObject15_checkout_writeEPmm+0xa8)[0x5a64c8]
 /home/y/bin/traffic_server(_ZN9LogObject3logEP9LogAccessPc+0x2f0)[0x5a7e30]
 /home/y/bin/traffic_server(_ZN3Log6accessEP9LogAccess+0x146)[0x58f506]
 /home/y/bin/traffic_server(_ZN6HttpSM12update_statsEv+0x630)[0x526c50]
 /home/y/bin/traffic_server(_ZN3Log6accessEP9LogAccess+0x146)[0x58f506]
 /home/y/bin/traffic_server(_ZN6HttpSM9kill_thisEv+0x928)[0x52b548]
 /home/y/bin/traffic_server(_ZN6HttpSM12update_statsEv+0x630)[0x526c50]
 /home/y/bin/traffic_server(_ZN6HttpSM9kill_thisEv+0x928)[0x52b548]
 /home/y/bin/traffic_server(_ZN6HttpSM12main_handlerEiPv+0x198)[0x52b868]
 /home/y/bin/traffic_server(_ZN10HttpTunnel12main_handlerEiPv+0xde)[0x56c3ee]
 /home/y/bin/traffic_server[0x6736a1]
 /home/y/bin/traffic_server(_ZN6HttpSM12main_handlerEiPv+0x198)[0x52b868]
 

[jira] [Commented] (TS-1330) Logging related segfault in 3.2.0

2013-07-25 Thread Leif Hedstrom (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-1330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13720381#comment-13720381
 ] 

Leif Hedstrom commented on TS-1330:
---

From the dupe bug (Yunkai's comment)

{code}

(gdb) bt
#0  0x005f0384 in LogBuffer::checkout_write (this=0x2aeb30002ff0, 
write_offset=0x2aeb09f5f4e8, write_size=152)
at LogBuffer.cc:287
#1  0x0060bc3f in LogObject::_checkout_write (this=0x1ed8800, 
write_offset=0x2aeb09f5f4e8, bytes_needed=152)
at LogObject.cc:424
#2  0x0060c43e in LogObject::log (this=0x1ed8800, lad=0x2aeb09f5f7a0, 
text_entry=0x0) at LogObject.cc:579
#3  0x005f761a in LogObjectManager::log (this=0x1eaf400, 
lad=0x2aeb09f5f7a0) at LogObject.h:395
#4  0x005f6379 in Log::access (lad=0x2aeb09f5f7a0) at Log.cc:1117
#5  0x0058506e in HttpSM::update_stats (this=0x2aeb41379f10) at 
HttpSM.cc:6312
#6  0x00584d21 in HttpSM::kill_this (this=0x2aeb41379f10) at 
HttpSM.cc:6273
#7  0x005798ea in HttpSM::main_handler (this=0x2aeb41379f10, 
event=2301, data=0x2aeb4137ba70) at HttpSM.cc:2593
#8  0x004f1f1e in Continuation::handleEvent (this=0x2aeb41379f10, 
event=2301, data=0x2aeb4137ba70)
at ../iocore/eventsystem/I_Continuation.h:146
#9  0x005bc819 in HttpTunnel::main_handler (this=0x2aeb4137ba70, 
event=103, data=0x2aebb000f050)
at HttpTunnel.cc:1476
#10 0x004f1f1e in Continuation::handleEvent (this=0x2aeb4137ba70, 
event=103, data=0x2aebb000f050)
at ../iocore/eventsystem/I_Continuation.h:146
#11 0x006ee000 in write_signal_and_update (event=103, 
vc=0x2aebb000eee0) at UnixNetVConnection.cc:153
#12 0x006ee0f9 in write_signal_done (event=103, nh=0x2aeb08d542a8, 
vc=0x2aebb000eee0) at UnixNetVConnection.cc:180
#13 0x006eef0a in write_to_net_io (nh=0x2aeb08d542a8, 
vc=0x2aebb000eee0, thread=0x2aeb08d51010)
at UnixNetVConnection.cc:479
#14 0x006ee8c9 in write_to_net (nh=0x2aeb08d542a8, vc=0x2aebb000eee0, 
pd=0x2aeb1466b010, thread=0x2aeb08d51010)
at UnixNetVConnection.cc:352
#15 0x006eae36 in NetHandler::mainNetEvent (this=0x2aeb08d542a8, 
event=5, e=0x2aeb0a6746e0) at UnixNet.cc:394
#16 0x004f1f1e in Continuation::handleEvent (this=0x2aeb08d542a8, 
event=5, data=0x2aeb0a6746e0)
at ../iocore/eventsystem/I_Continuation.h:146
#17 0x0070f3d2 in EThread::process_event (this=0x2aeb08d51010, 
e=0x2aeb0a6746e0, calling_code=5)
at UnixEThread.cc:142
#18 0x0070f8ef in EThread::execute (this=0x2aeb08d51010) at 
UnixEThread.cc:266
#19 0x0070e62f in spawn_thread_internal (a=0x1b3f520) at Thread.cc:88
#20 0x003c756077f1 in start_thread () from /lib64/libpthread.so.0
#21 0x003c752e570d in clone () from /lib64/libc.so.6
(gdb) f 0
#0  0x005f0384 in LogBuffer::checkout_write (this=0x2aeb30002ff0, 
write_offset=0x2aeb09f5f4e8, write_size=152)
at LogBuffer.cc:287
287 entry_header-timestamp = tp.tv_sec;
(gdb) p entry_header 
$7 = (LogEntryHeader *) 0x0
(gdb) p old_s
$8 = {ival = 0, s = {offset = 0, num_entries = 0, byte_count = 0, full = 0, 
num_writers = 0}}
(gdb) p new_s
$9 = {ival = 563671507992744, s = {offset = 168, num_entries = 1, byte_count = 
168, full = 0, num_writers = 1}}
(gdb) p m_buffer
$10 = 0x2aeb30005000 \316\372\316\n\002
(gdb) p *this
$11 = {write_link = {next = 0x0}, link = {SLinkLogBuffer = {next = 0x0}, 
prev = 0x0}, static M_ID = 57301638, 
  m_unaligned_buffer = 0x2aeb30004fa0 \350\a, m_buffer = 0x2aeb30005000 
\316\372\316\n\002, m_size = 18432, 
  m_buf_align = 512, m_write_align = 8, m_state = {ival = 923143098288024, s = 
{offset = 18328, num_entries = 115, 
  byte_count = 18328, full = 1, num_writers = 1}}, m_max_entries = 200, 
m_expiration_time = 1374692424, 
  m_owner = 0x1ed8800, m_header = 0x2aeb30005000, m_id = 57301630, m_references 
= 1}
{code}

 Logging related segfault in 3.2.0
 -

 Key: TS-1330
 URL: https://issues.apache.org/jira/browse/TS-1330
 Project: Traffic Server
  Issue Type: Bug
  Components: Logging
Affects Versions: 3.2.0
 Environment: ATS 3.2.0 on RHEL 6.2 64-bit
Reporter: David Carlin
Assignee: Bryan Call
Priority: Critical
  Labels: A, crash
 Fix For: 3.5.0


 I observed the following crash once on one of our ATS boxes - possibly 
 related to TS-1240
 Jul  2 13:56:56 l2 traffic_server[25853]: {0x2b0a391e1700} ERROR: 
 [SSL_NetVConnection::ssl_read_from_net] SSL_ERROR_SYSCALL, underlying IO 
 error: Connection reset by peer
 Jul  2 13:59:56 l2 kernel: [ET_NET 1][25855]: segfault at c ip 
 0058e083 sp 2b0a2982b740 error 6
 Jul  2 13:59:56 l2 kernel: [ET_NET 3][25857]: segfault at 84 ip 
 0058e083 sp 2b0a29a31740 error 6 in traffic_server[40+34]
 Jul  2 

[jira] [Commented] (TS-1330) Logging related segfault in 3.2.0

2013-07-17 Thread Leif Hedstrom (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-1330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13711821#comment-13711821
 ] 

Leif Hedstrom commented on TS-1330:
---

Bryan: Any updates? Should we just move this out to v3.3.6 (and hence, v3.5.0) ?

 Logging related segfault in 3.2.0
 -

 Key: TS-1330
 URL: https://issues.apache.org/jira/browse/TS-1330
 Project: Traffic Server
  Issue Type: Bug
  Components: Logging
Affects Versions: 3.2.0
 Environment: ATS 3.2.0 on RHEL 6.2 64-bit
Reporter: David Carlin
Assignee: Bryan Call
Priority: Critical
  Labels: A, crash
 Fix For: 3.3.5


 I observed the following crash once on one of our ATS boxes - possibly 
 related to TS-1240
 Jul  2 13:56:56 l2 traffic_server[25853]: {0x2b0a391e1700} ERROR: 
 [SSL_NetVConnection::ssl_read_from_net] SSL_ERROR_SYSCALL, underlying IO 
 error: Connection reset by peer
 Jul  2 13:59:56 l2 kernel: [ET_NET 1][25855]: segfault at c ip 
 0058e083 sp 2b0a2982b740 error 6
 Jul  2 13:59:56 l2 kernel: [ET_NET 3][25857]: segfault at 84 ip 
 0058e083 sp 2b0a29a31740 error 6 in traffic_server[40+34]
 Jul  2 13:59:56 l2 kernel: in traffic_server[40+34]
 Jul  2 14:02:59 l2 traffic_cop[25901]: (test) read timeout [18 ]
 Jul  2 14:02:59 l2 traffic_cop[25901]: server heartbeat failed [1]
 Jul  2 14:03:08 l2 traffic_manager[25826]: {0x7f3f088607e0} FATAL: 
 [LocalManager::pollMgmtProcessServer] Error in read (errno: 104)
 Jul  2 14:03:09 l2 traffic_manager[25826]: {0x7f3f088607e0} FATAL:  (last 
 system error 104: Connection reset by peer)
 Jul  2 14:03:09 l2 traffic_cop[25901]: cannot find traffic_server [1]
 Jul  2 14:03:09 l2 traffic_manager[25826]: {0x7f3f088607e0} ERROR: 
 [LocalManager::sendMgmtMsgToProcesses] Error writing message
 Jul  2 14:03:09 l2 traffic_manager[25826]: {0x7f3f088607e0} ERROR:  (last 
 system error 32: Broken pipe)
 Jul  2 14:03:12 l2 traffic_cop[25901]: cop received child status signal 
 [25826 35584]
 Jul  2 14:03:12 l2 traffic_cop[25901]: traffic_manager not running, making 
 sure traffic_server is dead
 Jul  2 14:03:12 l2 traffic_cop[25901]: spawning traffic_manager
 Jul  2 14:03:13 l2 traffic_manager[18267]: NOTE: --- Manager Starting ---
 Jul  2 14:03:13 l2 traffic_manager[18267]: NOTE: Manager Version: Apache 
 Traffic Server - traffic_manager - 3.2.0 - (build # 52518 on Jun 25 2012 at 
 18:22:12)
 Jul  2 14:03:13 l2 traffic_manager[18267]: {0x7fe63de3f7e0} STATUS: opened 
 /home/y/logs/trafficserver/manager.log
 Jul  2 14:03:15 l2 traffic_server[18322]: NOTE: --- Server Starting ---
 Jul  2 14:03:15 l2 traffic_server[18322]: NOTE: Server Version: Apache 
 Traffic Server - traffic_server - 3.2.0 - (build # 52518 on Jun 25 2012 at 
 18:22:31)
 Jul  2 14:03:15 l2 traffic_server[18322]: {0x2b77573ab860} STATUS: opened 
 /home/y/logs/trafficserver/diags.log
 Jul  2 14:03:15 l2 traffic_server[18322]: {0x2b77573ab860} ERROR: Cannot 
 insert duplicate!
 Jul  2 14:03:22 l2 traffic_cop[25901]: server heartbeat succeeded
 [Jul  2 13:56:56.304] Server {0x2b0a391e1700} ERROR: 
 [SSL_NetVConnection::ssl_read_from_net] SSL_ERROR_SYSCALL, underlying IO 
 error: Connection reset by peer
 NOTE: Traffic Server received Sig 11: Segmentation fault
 NOTE: Traffic Server received Sig 11: Segmentation fault
 /home/y/bin/traffic_server - STACK TRACE: 
 /home/y/bin/traffic_server - STACK TRACE: 
 /lib64/libpthread.so.0[0x3b54e0f4a0]
 /lib64/libpthread.so.0[0x3b54e0f4a0]
 /home/y/bin/traffic_server(_ZN9LogBuffer14checkout_writeEPmm+0x153)[0x58e083]
 /home/y/bin/traffic_server(_ZN9LogObject15_checkout_writeEPmm+0xa8)[0x5a64c8]
 /home/y/bin/traffic_server(_ZN9LogObject3logEP9LogAccessPc+0x2f0)[0x5a7e30]
 /home/y/bin/traffic_server(_ZN9LogBuffer14checkout_writeEPmm+0x153)[0x58e083]
 /home/y/bin/traffic_server(_ZN9LogObject15_checkout_writeEPmm+0xa8)[0x5a64c8]
 /home/y/bin/traffic_server(_ZN9LogObject3logEP9LogAccessPc+0x2f0)[0x5a7e30]
 /home/y/bin/traffic_server(_ZN3Log6accessEP9LogAccess+0x146)[0x58f506]
 /home/y/bin/traffic_server(_ZN6HttpSM12update_statsEv+0x630)[0x526c50]
 /home/y/bin/traffic_server(_ZN3Log6accessEP9LogAccess+0x146)[0x58f506]
 /home/y/bin/traffic_server(_ZN6HttpSM9kill_thisEv+0x928)[0x52b548]
 /home/y/bin/traffic_server(_ZN6HttpSM12update_statsEv+0x630)[0x526c50]
 /home/y/bin/traffic_server(_ZN6HttpSM9kill_thisEv+0x928)[0x52b548]
 /home/y/bin/traffic_server(_ZN6HttpSM12main_handlerEiPv+0x198)[0x52b868]
 /home/y/bin/traffic_server(_ZN10HttpTunnel12main_handlerEiPv+0xde)[0x56c3ee]
 /home/y/bin/traffic_server[0x6736a1]
 /home/y/bin/traffic_server(_ZN6HttpSM12main_handlerEiPv+0x198)[0x52b868]
 /home/y/bin/traffic_server(_Z15write_to_net_ioP10NetHandlerP18UnixNetVConnectionP7EThread+0x847)[0x675517]
 

[jira] [Commented] (TS-1330) Logging related segfault in 3.2.0

2013-07-09 Thread Leif Hedstrom (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-1330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13703409#comment-13703409
 ] 

Leif Hedstrom commented on TS-1330:
---

Assigning to Bryan to make a decision to either fix it or move it out to v3.5.0.

 Logging related segfault in 3.2.0
 -

 Key: TS-1330
 URL: https://issues.apache.org/jira/browse/TS-1330
 Project: Traffic Server
  Issue Type: Bug
  Components: Logging
Affects Versions: 3.2.0
 Environment: ATS 3.2.0 on RHEL 6.2 64-bit
Reporter: David Carlin
Assignee: Leif Hedstrom
Priority: Critical
  Labels: A, crash
 Fix For: 3.3.5


 I observed the following crash once on one of our ATS boxes - possibly 
 related to TS-1240
 Jul  2 13:56:56 l2 traffic_server[25853]: {0x2b0a391e1700} ERROR: 
 [SSL_NetVConnection::ssl_read_from_net] SSL_ERROR_SYSCALL, underlying IO 
 error: Connection reset by peer
 Jul  2 13:59:56 l2 kernel: [ET_NET 1][25855]: segfault at c ip 
 0058e083 sp 2b0a2982b740 error 6
 Jul  2 13:59:56 l2 kernel: [ET_NET 3][25857]: segfault at 84 ip 
 0058e083 sp 2b0a29a31740 error 6 in traffic_server[40+34]
 Jul  2 13:59:56 l2 kernel: in traffic_server[40+34]
 Jul  2 14:02:59 l2 traffic_cop[25901]: (test) read timeout [18 ]
 Jul  2 14:02:59 l2 traffic_cop[25901]: server heartbeat failed [1]
 Jul  2 14:03:08 l2 traffic_manager[25826]: {0x7f3f088607e0} FATAL: 
 [LocalManager::pollMgmtProcessServer] Error in read (errno: 104)
 Jul  2 14:03:09 l2 traffic_manager[25826]: {0x7f3f088607e0} FATAL:  (last 
 system error 104: Connection reset by peer)
 Jul  2 14:03:09 l2 traffic_cop[25901]: cannot find traffic_server [1]
 Jul  2 14:03:09 l2 traffic_manager[25826]: {0x7f3f088607e0} ERROR: 
 [LocalManager::sendMgmtMsgToProcesses] Error writing message
 Jul  2 14:03:09 l2 traffic_manager[25826]: {0x7f3f088607e0} ERROR:  (last 
 system error 32: Broken pipe)
 Jul  2 14:03:12 l2 traffic_cop[25901]: cop received child status signal 
 [25826 35584]
 Jul  2 14:03:12 l2 traffic_cop[25901]: traffic_manager not running, making 
 sure traffic_server is dead
 Jul  2 14:03:12 l2 traffic_cop[25901]: spawning traffic_manager
 Jul  2 14:03:13 l2 traffic_manager[18267]: NOTE: --- Manager Starting ---
 Jul  2 14:03:13 l2 traffic_manager[18267]: NOTE: Manager Version: Apache 
 Traffic Server - traffic_manager - 3.2.0 - (build # 52518 on Jun 25 2012 at 
 18:22:12)
 Jul  2 14:03:13 l2 traffic_manager[18267]: {0x7fe63de3f7e0} STATUS: opened 
 /home/y/logs/trafficserver/manager.log
 Jul  2 14:03:15 l2 traffic_server[18322]: NOTE: --- Server Starting ---
 Jul  2 14:03:15 l2 traffic_server[18322]: NOTE: Server Version: Apache 
 Traffic Server - traffic_server - 3.2.0 - (build # 52518 on Jun 25 2012 at 
 18:22:31)
 Jul  2 14:03:15 l2 traffic_server[18322]: {0x2b77573ab860} STATUS: opened 
 /home/y/logs/trafficserver/diags.log
 Jul  2 14:03:15 l2 traffic_server[18322]: {0x2b77573ab860} ERROR: Cannot 
 insert duplicate!
 Jul  2 14:03:22 l2 traffic_cop[25901]: server heartbeat succeeded
 [Jul  2 13:56:56.304] Server {0x2b0a391e1700} ERROR: 
 [SSL_NetVConnection::ssl_read_from_net] SSL_ERROR_SYSCALL, underlying IO 
 error: Connection reset by peer
 NOTE: Traffic Server received Sig 11: Segmentation fault
 NOTE: Traffic Server received Sig 11: Segmentation fault
 /home/y/bin/traffic_server - STACK TRACE: 
 /home/y/bin/traffic_server - STACK TRACE: 
 /lib64/libpthread.so.0[0x3b54e0f4a0]
 /lib64/libpthread.so.0[0x3b54e0f4a0]
 /home/y/bin/traffic_server(_ZN9LogBuffer14checkout_writeEPmm+0x153)[0x58e083]
 /home/y/bin/traffic_server(_ZN9LogObject15_checkout_writeEPmm+0xa8)[0x5a64c8]
 /home/y/bin/traffic_server(_ZN9LogObject3logEP9LogAccessPc+0x2f0)[0x5a7e30]
 /home/y/bin/traffic_server(_ZN9LogBuffer14checkout_writeEPmm+0x153)[0x58e083]
 /home/y/bin/traffic_server(_ZN9LogObject15_checkout_writeEPmm+0xa8)[0x5a64c8]
 /home/y/bin/traffic_server(_ZN9LogObject3logEP9LogAccessPc+0x2f0)[0x5a7e30]
 /home/y/bin/traffic_server(_ZN3Log6accessEP9LogAccess+0x146)[0x58f506]
 /home/y/bin/traffic_server(_ZN6HttpSM12update_statsEv+0x630)[0x526c50]
 /home/y/bin/traffic_server(_ZN3Log6accessEP9LogAccess+0x146)[0x58f506]
 /home/y/bin/traffic_server(_ZN6HttpSM9kill_thisEv+0x928)[0x52b548]
 /home/y/bin/traffic_server(_ZN6HttpSM12update_statsEv+0x630)[0x526c50]
 /home/y/bin/traffic_server(_ZN6HttpSM9kill_thisEv+0x928)[0x52b548]
 /home/y/bin/traffic_server(_ZN6HttpSM12main_handlerEiPv+0x198)[0x52b868]
 /home/y/bin/traffic_server(_ZN10HttpTunnel12main_handlerEiPv+0xde)[0x56c3ee]
 /home/y/bin/traffic_server[0x6736a1]
 /home/y/bin/traffic_server(_ZN6HttpSM12main_handlerEiPv+0x198)[0x52b868]
 /home/y/bin/traffic_server(_Z15write_to_net_ioP10NetHandlerP18UnixNetVConnectionP7EThread+0x847)[0x675517]
 

[jira] [Commented] (TS-1330) Logging related segfault in 3.2.0

2013-06-28 Thread Bryan Call (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-1330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13695914#comment-13695914
 ] 

Bryan Call commented on TS-1330:


Looking at the core dumps on a production box:

{code}
[bcall@l2 ~]$ sudo grep -A 10 'STACK TRACE' 
/usr/local/logs/trafficserver/traffic.out  | grep -v ^\\[
/usr/local/bin/traffic_server - STACK TRACE:
--
/usr/local/bin/traffic_server - STACK TRACE:
/lib64/libpthread.so.0[0x342f00f500]
/usr/local/bin/traffic_server(_ZN9LogBuffer14checkout_writeEPmm+0x153)[0x58e3e3]
/usr/local/bin/traffic_server(_ZN9LogObject15_checkout_writeEPmm+0xa8)[0x5a6828]
/usr/local/bin/traffic_server(_ZN9LogObject3logEP9LogAccessPc+0x2f0)[0x5a8190]
/usr/local/bin/traffic_server(_ZN3Log6accessEP9LogAccess+0x146)[0x58f866]
/usr/local/bin/traffic_server(_ZN6HttpSM12update_statsEv+0x630)[0x526fa0]
/usr/local/bin/traffic_server(_ZN6HttpSM9kill_thisEv+0x928)[0x52b898]
/usr/local/bin/traffic_server(_ZN6HttpSM12main_handlerEiPv+0x198)[0x52bbb8]
/usr/local/bin/traffic_server(_ZN10HttpTunnel12main_handlerEiPv+0xde)[0x56c74e]
/usr/local/bin/traffic_server[0x6743c1]
--
/usr/local/bin/traffic_server/usr/local/bin/traffic_server - STACK TRACE:
/usr/local/bin/traffic_server - STACK TRACE:
 - STACK TRACE:
/usr/local/bin/traffic_server/usr/local/bin/traffic_server - STACK TRACE:
 - STACK TRACE:
/lib64/libpthread.so.0[0x342f00f500]
/usr/local/bin/traffic_server(_ZN9LogBuffer14checkout_writeEPmm+0x153)[0x58e3e3]
/usr/local/bin/traffic_server(_ZN9LogObject15_checkout_writeEPmm+0xa8)[0x5a6828]
/usr/local/bin/traffic_server(_ZN9LogObject3logEP9LogAccessPc+0x2f0)[0x5a8190]
/lib64/libpthread.so.0[0x342f00f500]
/usr/local/bin/traffic_server(_ZN9LogBuffer14checkout_writeEPmm+0x153)[0x58e3e3]
/lib64/libpthread.so.0[0x342f00f500]
/usr/local/bin/traffic_server(_ZN9LogBuffer14checkout_writeEPmm+0x153)[0x58e3e3]
/usr/local/bin/traffic_server(_ZN9LogObject15_checkout_writeEPmm+0xa8)[0x5a6828]
/usr/local/bin/traffic_server(_ZN9LogObject3logEP9LogAccessPc+0x2f0)[0x5a8190]
--
/usr/local/bin/traffic_server - STACK TRACE:
/lib64/libpthread.so.0[0x342f00f500]
/usr/local/bin/traffic_server(_ZN9LogBuffer14checkout_writeEPmm+0x153)[0x58e3e3]
/usr/local/bin/traffic_server(_ZN9LogObject15_checkout_writeEPmm+0xa8)[0x5a6828]
/usr/local/bin/traffic_server(_ZN9LogObject3logEP9LogAccessPc+0x2f0)[0x5a8190]
/usr/local/bin/traffic_server(_ZN3Log6accessEP9LogAccess+0x146)[0x58f866]
/usr/local/bin/traffic_server(_ZN6HttpSM12update_statsEv+0x630)[0x526fa0]
/usr/local/bin/traffic_server(_ZN6HttpSM9kill_thisEv+0x928)[0x52b898]
/usr/local/bin/traffic_server(_ZN6HttpSM12main_handlerEiPv+0x198)[0x52bbb8]
/usr/local/bin/traffic_server[0x6735eb]
/usr/local/bin/traffic_server(_ZN18UnixNetVConnection9mainEventEiP5Event+0x5df)[0x67577f]
--
/usr/local/bin/traffic_server - STACK TRACE:
/lib64/libpthread.so.0[0x342f00f500]
/usr/local/bin/traffic_server(_ZN9LogBuffer14checkout_writeEPmm+0x153)[0x58e3e3]
/usr/local/bin/traffic_server(_ZN9LogObject15_checkout_writeEPmm+0xa8)[0x5a6828]
/usr/local/bin/traffic_server(_ZN9LogObject3logEP9LogAccessPc+0x2f0)[0x5a8190]
/usr/local/bin/traffic_server(_ZN3Log6accessEP9LogAccess+0x146)[0x58f866]
/usr/local/bin/traffic_server(_ZN6HttpSM12update_statsEv+0x630)[0x526fa0]
/usr/local/bin/traffic_server(_ZN6HttpSM9kill_thisEv+0x928)[0x52b898]
/usr/local/bin/traffic_server(_ZN6HttpSM12main_handlerEiPv+0x198)[0x52bbb8]
/usr/local/bin/traffic_server[0x673ca1]
/usr/local/bin/traffic_server[0x6777f5]
--
/usr/local/bin/traffic_server - STACK TRACE:


--
/usr/local/bin/traffic_server - STACK TRACE:
/lib64/libpthread.so.0[0x342f00f500]
/usr/local/bin/traffic_server(_ZN9LogBuffer14checkout_writeEPmm+0x153)[0x58e3e3]
/usr/local/bin/traffic_server(_ZN9LogObject15_checkout_writeEPmm+0xa8)[0x5a6828]
/usr/local/bin/traffic_server(_ZN9LogObject3logEP9LogAccessPc+0x2f0)[0x5a8190]
/usr/local/bin/traffic_server(_ZN3Log6accessEP9LogAccess+0x146)[0x58f866]
/usr/local/bin/traffic_server(_ZN6HttpSM12update_statsEv+0x630)[0x526fa0]
/usr/local/bin/traffic_server(_ZN6HttpSM9kill_thisEv+0x928)[0x52b898]
/usr/local/bin/traffic_server(_ZN6HttpSM12main_handlerEiPv+0x198)[0x52bbb8]
/usr/local/bin/traffic_server(_ZN10HttpTunnel12main_handlerEiPv+0xde)[0x56c74e]
/usr/local/bin/traffic_server[0x6743c1]
--
/usr/local/bin/traffic_server - STACK TRACE:
/lib64/libpthread.so.0[0x342f00f500]
/usr/local/lib64/libtsutil.so.3(ink_atomiclist_popall+0x64)[0x2ad678deedc4]
/usr/local/bin/traffic_server(_ZN16LogBufferManager13flush_buffersEP13LogBufferSink+0x1f)[0x5a734f]
/usr/local/bin/traffic_server(_ZN16LogObjectManager13flush_buffersEv+0x28)[0x5a7e28]
/usr/local/bin/traffic_server(_ZN3Log17flush_thread_mainEPv+0xaa)[0x59012a]
/usr/local/bin/traffic_server(_ZN24LoggingFlushContinuation9mainEventEiPv+0xb)[0x59361b]
/usr/local/bin/traffic_server(_ZN7EThread7executeEv+0xb86)[0x698886]
/usr/local/bin/traffic_server[0x696802]
/lib64/libpthread.so.0[0x342f007851]

[jira] [Commented] (TS-1330) Logging related segfault in 3.2.0

2012-10-23 Thread Peter (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-1330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13482935#comment-13482935
 ] 

Peter commented on TS-1330:
---

I'm seeing something similar in ATS/3.0.5

[2012-10-23 18:10:41]  [ET_SSL 4][24739]: segfault at 8a ip 00639afd sp 
2ab043ca6d90 error 6 in traffic_server[40+31d000]
[2012-10-23 18:21:06]  [ET_SSL 1][7225]: segfault at 8a ip 00639afd sp 
2b8def9a3d90 error 6 in traffic_server[40+31d000]
[2012-10-23 19:02:02]  [ET_SSL 0][7333]: segfault at 8a ip 00639afd sp 
2b23e40ffd90 error 6 in traffic_server[40+31d000]
[2012-10-23 19:02:11]  [ET_SSL 3][7677]: segfault at 8a ip 00639afd sp 
2b4ab3ba5d90 error 6 in traffic_server[40+31d000]

I'll try to get a core asap

 Logging related segfault in 3.2.0
 -

 Key: TS-1330
 URL: https://issues.apache.org/jira/browse/TS-1330
 Project: Traffic Server
  Issue Type: Bug
  Components: Logging
Affects Versions: 3.2.0
 Environment: ATS 3.2.0 on RHEL 6.2 64-bit
Reporter: David Carlin
 Fix For: 3.3.1


 I observed the following crash once on one of our ATS boxes - possibly 
 related to TS-1240
 Jul  2 13:56:56 l2 traffic_server[25853]: {0x2b0a391e1700} ERROR: 
 [SSL_NetVConnection::ssl_read_from_net] SSL_ERROR_SYSCALL, underlying IO 
 error: Connection reset by peer
 Jul  2 13:59:56 l2 kernel: [ET_NET 1][25855]: segfault at c ip 
 0058e083 sp 2b0a2982b740 error 6
 Jul  2 13:59:56 l2 kernel: [ET_NET 3][25857]: segfault at 84 ip 
 0058e083 sp 2b0a29a31740 error 6 in traffic_server[40+34]
 Jul  2 13:59:56 l2 kernel: in traffic_server[40+34]
 Jul  2 14:02:59 l2 traffic_cop[25901]: (test) read timeout [18 ]
 Jul  2 14:02:59 l2 traffic_cop[25901]: server heartbeat failed [1]
 Jul  2 14:03:08 l2 traffic_manager[25826]: {0x7f3f088607e0} FATAL: 
 [LocalManager::pollMgmtProcessServer] Error in read (errno: 104)
 Jul  2 14:03:09 l2 traffic_manager[25826]: {0x7f3f088607e0} FATAL:  (last 
 system error 104: Connection reset by peer)
 Jul  2 14:03:09 l2 traffic_cop[25901]: cannot find traffic_server [1]
 Jul  2 14:03:09 l2 traffic_manager[25826]: {0x7f3f088607e0} ERROR: 
 [LocalManager::sendMgmtMsgToProcesses] Error writing message
 Jul  2 14:03:09 l2 traffic_manager[25826]: {0x7f3f088607e0} ERROR:  (last 
 system error 32: Broken pipe)
 Jul  2 14:03:12 l2 traffic_cop[25901]: cop received child status signal 
 [25826 35584]
 Jul  2 14:03:12 l2 traffic_cop[25901]: traffic_manager not running, making 
 sure traffic_server is dead
 Jul  2 14:03:12 l2 traffic_cop[25901]: spawning traffic_manager
 Jul  2 14:03:13 l2 traffic_manager[18267]: NOTE: --- Manager Starting ---
 Jul  2 14:03:13 l2 traffic_manager[18267]: NOTE: Manager Version: Apache 
 Traffic Server - traffic_manager - 3.2.0 - (build # 52518 on Jun 25 2012 at 
 18:22:12)
 Jul  2 14:03:13 l2 traffic_manager[18267]: {0x7fe63de3f7e0} STATUS: opened 
 /home/y/logs/trafficserver/manager.log
 Jul  2 14:03:15 l2 traffic_server[18322]: NOTE: --- Server Starting ---
 Jul  2 14:03:15 l2 traffic_server[18322]: NOTE: Server Version: Apache 
 Traffic Server - traffic_server - 3.2.0 - (build # 52518 on Jun 25 2012 at 
 18:22:31)
 Jul  2 14:03:15 l2 traffic_server[18322]: {0x2b77573ab860} STATUS: opened 
 /home/y/logs/trafficserver/diags.log
 Jul  2 14:03:15 l2 traffic_server[18322]: {0x2b77573ab860} ERROR: Cannot 
 insert duplicate!
 Jul  2 14:03:22 l2 traffic_cop[25901]: server heartbeat succeeded
 [Jul  2 13:56:56.304] Server {0x2b0a391e1700} ERROR: 
 [SSL_NetVConnection::ssl_read_from_net] SSL_ERROR_SYSCALL, underlying IO 
 error: Connection reset by peer
 NOTE: Traffic Server received Sig 11: Segmentation fault
 NOTE: Traffic Server received Sig 11: Segmentation fault
 /home/y/bin/traffic_server - STACK TRACE: 
 /home/y/bin/traffic_server - STACK TRACE: 
 /lib64/libpthread.so.0[0x3b54e0f4a0]
 /lib64/libpthread.so.0[0x3b54e0f4a0]
 /home/y/bin/traffic_server(_ZN9LogBuffer14checkout_writeEPmm+0x153)[0x58e083]
 /home/y/bin/traffic_server(_ZN9LogObject15_checkout_writeEPmm+0xa8)[0x5a64c8]
 /home/y/bin/traffic_server(_ZN9LogObject3logEP9LogAccessPc+0x2f0)[0x5a7e30]
 /home/y/bin/traffic_server(_ZN9LogBuffer14checkout_writeEPmm+0x153)[0x58e083]
 /home/y/bin/traffic_server(_ZN9LogObject15_checkout_writeEPmm+0xa8)[0x5a64c8]
 /home/y/bin/traffic_server(_ZN9LogObject3logEP9LogAccessPc+0x2f0)[0x5a7e30]
 /home/y/bin/traffic_server(_ZN3Log6accessEP9LogAccess+0x146)[0x58f506]
 /home/y/bin/traffic_server(_ZN6HttpSM12update_statsEv+0x630)[0x526c50]
 /home/y/bin/traffic_server(_ZN3Log6accessEP9LogAccess+0x146)[0x58f506]
 /home/y/bin/traffic_server(_ZN6HttpSM9kill_thisEv+0x928)[0x52b548]
 /home/y/bin/traffic_server(_ZN6HttpSM12update_statsEv+0x630)[0x526c50]
 

[jira] [Commented] (TS-1330) Logging related segfault in 3.2.0

2012-08-17 Thread David Carlin (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-1330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13436984#comment-13436984
 ] 

David Carlin commented on TS-1330:
--

Still seeing this on a few different boxes occasionally:

NOTE: Traffic Server received Sig 11: Segmentation fault
/home/y/bin/traffic_server - STACK TRACE: 
/lib64/libpthread.so.0[0x35ea80f500]
/home/y/bin/traffic_server(_ZN9LogBuffer14checkout_writeEPmm+0x153)[0x58e083]
/home/y/bin/traffic_server(_ZN9LogObject15_checkout_writeEPmm+0xa8)[0x5a64c8]
/home/y/bin/traffic_server(_ZN9LogObject3logEP9LogAccessPc+0x2f0)[0x5a7e30]
/home/y/bin/traffic_server(_ZN3Log6accessEP9LogAccess+0x146)[0x58f506]
/home/y/bin/traffic_server(_ZN6HttpSM12update_statsEv+0x630)[0x526c50]
/home/y/bin/traffic_server(_ZN6HttpSM9kill_thisEv+0x928)[0x52b548]
/home/y/bin/traffic_server(_ZN6HttpSM12main_handlerEiPv+0x198)[0x52b868]
/home/y/bin/traffic_server(_ZN10HttpTunnel12main_handlerEiPv+0xde)[0x56c3ee]
/home/y/bin/traffic_server[0x673871]
/home/y/bin/traffic_server(_Z15write_to_net_ioP10NetHandlerP18UnixNetVConnectionP7EThread+0x847)[0x6756e7]
/home/y/bin/traffic_server(_ZN10NetHandler12mainNetEventEiP5Event+0x286)[0x66e076]
/home/y/bin/traffic_server(_ZN7EThread13process_eventEP5Eventi+0xb4)[0x696ce4]
/home/y/bin/traffic_server(_ZN7EThread7executeEv+0x4c3)[0x697673]
/home/y/bin/traffic_server[0x695cb2]
/lib64/libpthread.so.0[0x35ea807851]
/lib64/libc.so.6(clone+0x6d)[0x35ea4e767d]

NOTE: Traffic Server received Sig 11: Segmentation fault
/home/y/bin/traffic_server - STACK TRACE: 
/lib64/libpthread.so.0[0x37d080f500]
/home/y/bin/traffic_server(_ZN9LogBuffer14checkout_writeEPmm+0x153)[0x58e083]
/home/y/bin/traffic_server(_ZN9LogObject15_checkout_writeEPmm+0xa8)[0x5a64c8]
/home/y/bin/traffic_server(_ZN9LogObject3logEP9LogAccessPc+0x2f0)[0x5a7e30]
/home/y/bin/traffic_server(_ZN3Log6accessEP9LogAccess+0x146)[0x58f506]
/home/y/bin/traffic_server(_ZN6HttpSM12update_statsEv+0x630)[0x526c50]
/home/y/bin/traffic_server(_ZN6HttpSM9kill_thisEv+0x928)[0x52b548]
/home/y/bin/traffic_server(_ZN6HttpSM12main_handlerEiPv+0x198)[0x52b868]
/home/y/bin/traffic_server(_ZN10HttpTunnel12main_handlerEiPv+0xde)[0x56c3ee]
/home/y/bin/traffic_server[0x673871]
/home/y/bin/traffic_server(_Z15write_to_net_ioP10NetHandlerP18UnixNetVConnectionP7EThread+0x847)[0x6756e7]
/home/y/bin/traffic_server(_ZN10NetHandler12mainNetEventEiP5Event+0x286)[0x66e076]
/home/y/bin/traffic_server(_ZN7EThread13process_eventEP5Eventi+0xb4)[0x696ce4]
/home/y/bin/traffic_server(_ZN7EThread7executeEv+0x4c3)[0x697673]
/home/y/bin/traffic_server[0x695cb2]
/lib64/libpthread.so.0[0x37d0807851]
/lib64/libc.so.6(clone+0x6d)[0x37d04e767d]


NOTE: Traffic Server received Sig 11: Segmentation fault
NOTE: Traffic Server received Sig 11: Segmentation fault
/home/y/bin/traffic_server - STACK TRACE: 
/home/y/bin/traffic_server - STACK TRACE: 
/lib64/libpthread.so.0[0x3ecd40f500]
/lib64/libpthread.so.0[0x3ecd40f500]
/home/y/bin/traffic_server(_ZN9LogBuffer14checkout_writeEPmm+0x153)[0x58e083]
/home/y/bin/traffic_server(_ZN9LogObject15_checkout_writeEPmm+0xa8)[0x5a64c8]
/home/y/bin/traffic_server(_ZN9LogObject3logEP9LogAccessPc+0x2f0)[0x5a7e30]
/home/y/bin/traffic_server(_ZN9LogBuffer14checkout_writeEPmm+0x153)[0x58e083]
/home/y/bin/traffic_server(_ZN9LogObject15_checkout_writeEPmm+0xa8)[0x5a64c8]
/home/y/bin/traffic_server(_ZN3Log6accessEP9LogAccess+0x146)[0x58f506]
/home/y/bin/traffic_server(_ZN9LogObject3logEP9LogAccessPc+0x2f0)[0x5a7e30]
/home/y/bin/traffic_server(_ZN6HttpSM12update_statsEv+0x630)[0x526c50]
/home/y/bin/traffic_server(_ZN3Log6accessEP9LogAccess+0x146)[0x58f506]
/home/y/bin/traffic_server(_ZN6HttpSM9kill_thisEv+0x928)[0x52b548]
/home/y/bin/traffic_server(_ZN6HttpSM12main_handlerEiPv+0x198)[0x52b868]
/home/y/bin/traffic_server(_ZN10HttpTunnel12main_handlerEiPv+0xde)[0x56c3ee]
/home/y/bin/traffic_server[0x673871]
/home/y/bin/traffic_server(_Z15write_to_net_ioP10NetHandlerP18UnixNetVConnectionP7EThread+0x847)[0x6756e7]
/home/y/bin/traffic_server(_ZN10NetHandler12mainNetEventEiP5Event+0x286)[0x66e076]
/home/y/bin/traffic_server(_ZN7EThread13process_eventEP5Eventi+0xb4)[0x696ce4]
/home/y/bin/traffic_server(_ZN7EThread7executeEv+0x4c3)[0x697673]
/home/y/bin/traffic_server[0x695cb2]
/lib64/libpthread.so.0[0x3ecd407851]
/lib64/libc.so.6(clone+0x6d)[0x3ecd0e767d]
/home/y/bin/traffic_server(_ZN6HttpSM12update_statsEv+0x630)[0x526c50]
/home/y/bin/traffic_server(_ZN6HttpSM9kill_thisEv+0x928)[0x52b548]
/home/y/bin/traffic_server(_ZN6HttpSM12main_handlerEiPv+0x198)[0x52b868]
/home/y/bin/traffic_server[0x673151]
/home/y/bin/traffic_server[0x676ca5]
/home/y/bin/traffic_server(_ZN10NetHandler12mainNetEventEiP5Event+0x1f2)[0x66dfe2]
/home/y/bin/traffic_server(_ZN7EThread13process_eventEP5Eventi+0xb4)[0x696ce4]
/home/y/bin/traffic_server(_ZN7EThread7executeEv+0x4c3)[0x697673]
/home/y/bin/traffic_server(main+0xea4)[0x4c26a4]

[jira] [Commented] (TS-1330) Logging related segfault in 3.2.0

2012-08-17 Thread Leif Hedstrom (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-1330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13437009#comment-13437009
 ] 

Leif Hedstrom commented on TS-1330:
---

Any chance you can produce some more detailed backtraces and information, from 
the core file? Bryan can probably help you if you need. The stacktraces from 
the logs are much less useful than the data from the core files.

 Logging related segfault in 3.2.0
 -

 Key: TS-1330
 URL: https://issues.apache.org/jira/browse/TS-1330
 Project: Traffic Server
  Issue Type: Bug
  Components: Logging
Affects Versions: 3.2.0
 Environment: ATS 3.2.0 on RHEL 6.2 64-bit
Reporter: David Carlin
 Fix For: 3.3.1


 I observed the following crash once on one of our ATS boxes - possibly 
 related to TS-1240
 Jul  2 13:56:56 l2 traffic_server[25853]: {0x2b0a391e1700} ERROR: 
 [SSL_NetVConnection::ssl_read_from_net] SSL_ERROR_SYSCALL, underlying IO 
 error: Connection reset by peer
 Jul  2 13:59:56 l2 kernel: [ET_NET 1][25855]: segfault at c ip 
 0058e083 sp 2b0a2982b740 error 6
 Jul  2 13:59:56 l2 kernel: [ET_NET 3][25857]: segfault at 84 ip 
 0058e083 sp 2b0a29a31740 error 6 in traffic_server[40+34]
 Jul  2 13:59:56 l2 kernel: in traffic_server[40+34]
 Jul  2 14:02:59 l2 traffic_cop[25901]: (test) read timeout [18 ]
 Jul  2 14:02:59 l2 traffic_cop[25901]: server heartbeat failed [1]
 Jul  2 14:03:08 l2 traffic_manager[25826]: {0x7f3f088607e0} FATAL: 
 [LocalManager::pollMgmtProcessServer] Error in read (errno: 104)
 Jul  2 14:03:09 l2 traffic_manager[25826]: {0x7f3f088607e0} FATAL:  (last 
 system error 104: Connection reset by peer)
 Jul  2 14:03:09 l2 traffic_cop[25901]: cannot find traffic_server [1]
 Jul  2 14:03:09 l2 traffic_manager[25826]: {0x7f3f088607e0} ERROR: 
 [LocalManager::sendMgmtMsgToProcesses] Error writing message
 Jul  2 14:03:09 l2 traffic_manager[25826]: {0x7f3f088607e0} ERROR:  (last 
 system error 32: Broken pipe)
 Jul  2 14:03:12 l2 traffic_cop[25901]: cop received child status signal 
 [25826 35584]
 Jul  2 14:03:12 l2 traffic_cop[25901]: traffic_manager not running, making 
 sure traffic_server is dead
 Jul  2 14:03:12 l2 traffic_cop[25901]: spawning traffic_manager
 Jul  2 14:03:13 l2 traffic_manager[18267]: NOTE: --- Manager Starting ---
 Jul  2 14:03:13 l2 traffic_manager[18267]: NOTE: Manager Version: Apache 
 Traffic Server - traffic_manager - 3.2.0 - (build # 52518 on Jun 25 2012 at 
 18:22:12)
 Jul  2 14:03:13 l2 traffic_manager[18267]: {0x7fe63de3f7e0} STATUS: opened 
 /home/y/logs/trafficserver/manager.log
 Jul  2 14:03:15 l2 traffic_server[18322]: NOTE: --- Server Starting ---
 Jul  2 14:03:15 l2 traffic_server[18322]: NOTE: Server Version: Apache 
 Traffic Server - traffic_server - 3.2.0 - (build # 52518 on Jun 25 2012 at 
 18:22:31)
 Jul  2 14:03:15 l2 traffic_server[18322]: {0x2b77573ab860} STATUS: opened 
 /home/y/logs/trafficserver/diags.log
 Jul  2 14:03:15 l2 traffic_server[18322]: {0x2b77573ab860} ERROR: Cannot 
 insert duplicate!
 Jul  2 14:03:22 l2 traffic_cop[25901]: server heartbeat succeeded
 [Jul  2 13:56:56.304] Server {0x2b0a391e1700} ERROR: 
 [SSL_NetVConnection::ssl_read_from_net] SSL_ERROR_SYSCALL, underlying IO 
 error: Connection reset by peer
 NOTE: Traffic Server received Sig 11: Segmentation fault
 NOTE: Traffic Server received Sig 11: Segmentation fault
 /home/y/bin/traffic_server - STACK TRACE: 
 /home/y/bin/traffic_server - STACK TRACE: 
 /lib64/libpthread.so.0[0x3b54e0f4a0]
 /lib64/libpthread.so.0[0x3b54e0f4a0]
 /home/y/bin/traffic_server(_ZN9LogBuffer14checkout_writeEPmm+0x153)[0x58e083]
 /home/y/bin/traffic_server(_ZN9LogObject15_checkout_writeEPmm+0xa8)[0x5a64c8]
 /home/y/bin/traffic_server(_ZN9LogObject3logEP9LogAccessPc+0x2f0)[0x5a7e30]
 /home/y/bin/traffic_server(_ZN9LogBuffer14checkout_writeEPmm+0x153)[0x58e083]
 /home/y/bin/traffic_server(_ZN9LogObject15_checkout_writeEPmm+0xa8)[0x5a64c8]
 /home/y/bin/traffic_server(_ZN9LogObject3logEP9LogAccessPc+0x2f0)[0x5a7e30]
 /home/y/bin/traffic_server(_ZN3Log6accessEP9LogAccess+0x146)[0x58f506]
 /home/y/bin/traffic_server(_ZN6HttpSM12update_statsEv+0x630)[0x526c50]
 /home/y/bin/traffic_server(_ZN3Log6accessEP9LogAccess+0x146)[0x58f506]
 /home/y/bin/traffic_server(_ZN6HttpSM9kill_thisEv+0x928)[0x52b548]
 /home/y/bin/traffic_server(_ZN6HttpSM12update_statsEv+0x630)[0x526c50]
 /home/y/bin/traffic_server(_ZN6HttpSM9kill_thisEv+0x928)[0x52b548]
 /home/y/bin/traffic_server(_ZN6HttpSM12main_handlerEiPv+0x198)[0x52b868]
 /home/y/bin/traffic_server(_ZN10HttpTunnel12main_handlerEiPv+0xde)[0x56c3ee]
 /home/y/bin/traffic_server[0x6736a1]
 /home/y/bin/traffic_server(_ZN6HttpSM12main_handlerEiPv+0x198)[0x52b868]
 

[jira] [Commented] (TS-1330) Logging related segfault in 3.2.0

2012-08-17 Thread David Carlin (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-1330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13437112#comment-13437112
 ] 

David Carlin commented on TS-1330:
--

I'm not able to find core dumps for this issue.  Whats in traffic.out is all I 
have.  There are core dumps on these boxes for other crashes, but for some 
reason these events don't generate one.

 Logging related segfault in 3.2.0
 -

 Key: TS-1330
 URL: https://issues.apache.org/jira/browse/TS-1330
 Project: Traffic Server
  Issue Type: Bug
  Components: Logging
Affects Versions: 3.2.0
 Environment: ATS 3.2.0 on RHEL 6.2 64-bit
Reporter: David Carlin
 Fix For: 3.3.1


 I observed the following crash once on one of our ATS boxes - possibly 
 related to TS-1240
 Jul  2 13:56:56 l2 traffic_server[25853]: {0x2b0a391e1700} ERROR: 
 [SSL_NetVConnection::ssl_read_from_net] SSL_ERROR_SYSCALL, underlying IO 
 error: Connection reset by peer
 Jul  2 13:59:56 l2 kernel: [ET_NET 1][25855]: segfault at c ip 
 0058e083 sp 2b0a2982b740 error 6
 Jul  2 13:59:56 l2 kernel: [ET_NET 3][25857]: segfault at 84 ip 
 0058e083 sp 2b0a29a31740 error 6 in traffic_server[40+34]
 Jul  2 13:59:56 l2 kernel: in traffic_server[40+34]
 Jul  2 14:02:59 l2 traffic_cop[25901]: (test) read timeout [18 ]
 Jul  2 14:02:59 l2 traffic_cop[25901]: server heartbeat failed [1]
 Jul  2 14:03:08 l2 traffic_manager[25826]: {0x7f3f088607e0} FATAL: 
 [LocalManager::pollMgmtProcessServer] Error in read (errno: 104)
 Jul  2 14:03:09 l2 traffic_manager[25826]: {0x7f3f088607e0} FATAL:  (last 
 system error 104: Connection reset by peer)
 Jul  2 14:03:09 l2 traffic_cop[25901]: cannot find traffic_server [1]
 Jul  2 14:03:09 l2 traffic_manager[25826]: {0x7f3f088607e0} ERROR: 
 [LocalManager::sendMgmtMsgToProcesses] Error writing message
 Jul  2 14:03:09 l2 traffic_manager[25826]: {0x7f3f088607e0} ERROR:  (last 
 system error 32: Broken pipe)
 Jul  2 14:03:12 l2 traffic_cop[25901]: cop received child status signal 
 [25826 35584]
 Jul  2 14:03:12 l2 traffic_cop[25901]: traffic_manager not running, making 
 sure traffic_server is dead
 Jul  2 14:03:12 l2 traffic_cop[25901]: spawning traffic_manager
 Jul  2 14:03:13 l2 traffic_manager[18267]: NOTE: --- Manager Starting ---
 Jul  2 14:03:13 l2 traffic_manager[18267]: NOTE: Manager Version: Apache 
 Traffic Server - traffic_manager - 3.2.0 - (build # 52518 on Jun 25 2012 at 
 18:22:12)
 Jul  2 14:03:13 l2 traffic_manager[18267]: {0x7fe63de3f7e0} STATUS: opened 
 /home/y/logs/trafficserver/manager.log
 Jul  2 14:03:15 l2 traffic_server[18322]: NOTE: --- Server Starting ---
 Jul  2 14:03:15 l2 traffic_server[18322]: NOTE: Server Version: Apache 
 Traffic Server - traffic_server - 3.2.0 - (build # 52518 on Jun 25 2012 at 
 18:22:31)
 Jul  2 14:03:15 l2 traffic_server[18322]: {0x2b77573ab860} STATUS: opened 
 /home/y/logs/trafficserver/diags.log
 Jul  2 14:03:15 l2 traffic_server[18322]: {0x2b77573ab860} ERROR: Cannot 
 insert duplicate!
 Jul  2 14:03:22 l2 traffic_cop[25901]: server heartbeat succeeded
 [Jul  2 13:56:56.304] Server {0x2b0a391e1700} ERROR: 
 [SSL_NetVConnection::ssl_read_from_net] SSL_ERROR_SYSCALL, underlying IO 
 error: Connection reset by peer
 NOTE: Traffic Server received Sig 11: Segmentation fault
 NOTE: Traffic Server received Sig 11: Segmentation fault
 /home/y/bin/traffic_server - STACK TRACE: 
 /home/y/bin/traffic_server - STACK TRACE: 
 /lib64/libpthread.so.0[0x3b54e0f4a0]
 /lib64/libpthread.so.0[0x3b54e0f4a0]
 /home/y/bin/traffic_server(_ZN9LogBuffer14checkout_writeEPmm+0x153)[0x58e083]
 /home/y/bin/traffic_server(_ZN9LogObject15_checkout_writeEPmm+0xa8)[0x5a64c8]
 /home/y/bin/traffic_server(_ZN9LogObject3logEP9LogAccessPc+0x2f0)[0x5a7e30]
 /home/y/bin/traffic_server(_ZN9LogBuffer14checkout_writeEPmm+0x153)[0x58e083]
 /home/y/bin/traffic_server(_ZN9LogObject15_checkout_writeEPmm+0xa8)[0x5a64c8]
 /home/y/bin/traffic_server(_ZN9LogObject3logEP9LogAccessPc+0x2f0)[0x5a7e30]
 /home/y/bin/traffic_server(_ZN3Log6accessEP9LogAccess+0x146)[0x58f506]
 /home/y/bin/traffic_server(_ZN6HttpSM12update_statsEv+0x630)[0x526c50]
 /home/y/bin/traffic_server(_ZN3Log6accessEP9LogAccess+0x146)[0x58f506]
 /home/y/bin/traffic_server(_ZN6HttpSM9kill_thisEv+0x928)[0x52b548]
 /home/y/bin/traffic_server(_ZN6HttpSM12update_statsEv+0x630)[0x526c50]
 /home/y/bin/traffic_server(_ZN6HttpSM9kill_thisEv+0x928)[0x52b548]
 /home/y/bin/traffic_server(_ZN6HttpSM12main_handlerEiPv+0x198)[0x52b868]
 /home/y/bin/traffic_server(_ZN10HttpTunnel12main_handlerEiPv+0xde)[0x56c3ee]
 /home/y/bin/traffic_server[0x6736a1]
 /home/y/bin/traffic_server(_ZN6HttpSM12main_handlerEiPv+0x198)[0x52b868]
 /home/y/bin/traffic_server(_Z15write_to_net_ioP10NetHandlerP18UnixNetVConnectionP7EThread+0x847)[0x675517]