[jira] [Commented] (TS-1114) Crash report: HttpTransactCache::SelectFromAlternates

2012-03-18 Thread Zhao Yongming (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/TS-1114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232196#comment-13232196
 ] 

Zhao Yongming commented on TS-1114:
---

While tracking down this issue, we followed two directions.
Weijin investigated why the event is 8, since there should be no event 8 in
the event system. In other core dumps we are sure that the event is not a real
event at all; it looks like random data. That points to two interesting
possibilities: 1, the old data (which may or may not be the same event) was
freed but the event was not canceled; 2, something overwrote the data in this
event. Weijin followed this path and found that the action cancel code may
cause problems in certain situations. He made a patch in our tree, and we
applied it to half of our servers, which have run without any crash for weeks.
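
To make the first possibility concrete, here is a minimal sketch of a cancel
flag that is set and checked under the same lock as the handler, so a stale
event is dropped instead of being delivered. It is a toy model in Python, not
the Traffic Server code or the attached patch; the names Action, Event and
dispatch are hypothetical.

```python
import threading


class Action:
    """Toy stand-in for a cancellable handle (not the ATS Action class)."""

    def __init__(self, mutex):
        self.mutex = mutex          # lock shared with the event's handler
        self.cancelled = False

    def cancel(self):
        # Set the flag only while holding the handler's lock, so a dispatch
        # in progress never races with the cancellation.
        with self.mutex:
            self.cancelled = True


class Event:
    """Toy event: a handler callable plus its cancellable action."""

    def __init__(self, handler, mutex):
        self.handler = handler
        self.action = Action(mutex)


def dispatch(event, code):
    """Deliver `code` to the handler unless the event was cancelled."""
    with event.action.mutex:
        if event.action.cancelled:
            return False            # drop it; the owner may already be gone
        event.handler(code)
        return True


received = []
lock = threading.Lock()
ev = Event(received.append, lock)
dispatch(ev, 1)        # delivered to the handler
ev.action.cancel()
dispatch(ev, 8)        # dropped: the handler never sees the stale event
```

If the cancel were done without taking the lock, a dispatch on another thread
could pass the check and then run the handler against data that has already
been freed, which would match the random-looking event values in the core
dumps.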

At the same time, Koutai worked on making the vector write and read safer,
even in some very strange situations. That patch was also applied to half of
our servers, which run without any crash too.

After careful discussion, we concluded that Weijin's patch is the one we need
to keep, and here is the patch.

Back to TS-857: looking at it again, there is a strange event in the
backtrace, we have only , is that the same issue here? Where is the action
canceled without mutex protection? If we consider TS-1114 a good fix, then we
should treat TS-857 as the same crash.

So far, I am not sure how many crashes remain after patching with TS-1114; I
just don't get many new backtraces for this issue. TS-1114 may cover many
strange crashes, as it can make the system behave really strangely.

 Crash report: HttpTransactCache::SelectFromAlternates
 -

 Key: TS-1114
 URL: https://issues.apache.org/jira/browse/TS-1114
 Project: Traffic Server
  Issue Type: Bug
Reporter: Zhao Yongming
Assignee: weijin
 Fix For: 3.1.4

 Attachments: cache_crash.diff


 it may or may not be the upstream issue, let us open it for tracking.
 {code}
 #0  0x0053075e in HttpTransactCache::SelectFromAlternates 
 (cache_vector=0x2aaab80ff500, client_request=0x2aaab80ff4c0, 
 http_config_params=0x2aaab547b800) at ../../proxy/hdrs/HTTP.h:1375
 1375    ((int32_t *) val)[0] = m_alt->m_object_key[0];
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (TS-1148) support systemd init system

2012-03-18 Thread Jan-Frode Myklebust (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-1148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan-Frode Myklebust updated TS-1148:


Attachment: 0001-TS-1148-Support-systemd-activation-of-ATS.patch

Suggested systemd service file.

 support systemd init system
 ---

 Key: TS-1148
 URL: https://issues.apache.org/jira/browse/TS-1148
 Project: Traffic Server
  Issue Type: Improvement
  Components: Packaging
Reporter: Jan-Frode Myklebust
 Fix For: 3.0.3

 Attachments: 0001-TS-1148-Support-systemd-activation-of-ATS.patch


 ATS should support systemd init system for managing the trafficserver service.





[jira] [Created] (TS-1149) pretty up automate output

2012-03-18 Thread James Peach (Created) (JIRA)
pretty up automate output
-

 Key: TS-1149
 URL: https://issues.apache.org/jira/browse/TS-1149
 Project: Traffic Server
  Issue Type: Improvement
  Components: Build
Reporter: James Peach
Assignee: James Peach
Priority: Trivial


automake is super ugly by default. Make it pretty.





[jira] [Created] (TS-1150) Improve on some header handling functionality

2012-03-18 Thread Leif Hedstrom (Created) (JIRA)
Improve on some header handling functionality
-

 Key: TS-1150
 URL: https://issues.apache.org/jira/browse/TS-1150
 Project: Traffic Server
  Issue Type: Bug
  Components: Core
Reporter: Leif Hedstrom
Assignee: Leif Hedstrom
 Fix For: 3.1.4


There are a few performance and functionality improvements we can make around 
header handling.





[jira] [Updated] (TS-1148) support systemd init system

2012-03-18 Thread Jan-Frode Myklebust (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-1148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan-Frode Myklebust updated TS-1148:


Attachment: 0002-TS-1148-Support-systemd-activation-of-ATS.patch

Updated patch: moves trafficserver.service to rc/ and uses automake string 
substitutions.

 support systemd init system
 ---

 Key: TS-1148
 URL: https://issues.apache.org/jira/browse/TS-1148
 Project: Traffic Server
  Issue Type: Improvement
  Components: Packaging
Reporter: Jan-Frode Myklebust
 Fix For: 3.0.3

 Attachments: 0001-TS-1148-Support-systemd-activation-of-ATS.patch, 
 0002-TS-1148-Support-systemd-activation-of-ATS.patch


 ATS should support systemd init system for managing the trafficserver service.





[jira] [Updated] (TS-1149) pretty up automake output

2012-03-18 Thread James Peach (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-1149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Peach updated TS-1149:


Summary: pretty up automake output  (was: pretty up automate output)

 pretty up automake output
 -

 Key: TS-1149
 URL: https://issues.apache.org/jira/browse/TS-1149
 Project: Traffic Server
  Issue Type: Improvement
  Components: Build
Reporter: James Peach
Assignee: James Peach
Priority: Trivial

 automake is super ugly by default. Make it pretty.





[jira] [Resolved] (TS-1149) pretty up automake output

2012-03-18 Thread James Peach (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-1149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Peach resolved TS-1149.
-

   Resolution: Fixed
Fix Version/s: 3.1.4

633f34d18a81e31542d36886420f8478eebfa0b2 TS-1149: Pretty up automake output


 pretty up automake output
 -

 Key: TS-1149
 URL: https://issues.apache.org/jira/browse/TS-1149
 Project: Traffic Server
  Issue Type: Improvement
  Components: Build
Reporter: James Peach
Assignee: James Peach
Priority: Trivial
 Fix For: 3.1.4


 automake is super ugly by default. Make it pretty.





[jira] [Created] (TS-1151) in some strange situation, cop will crash

2012-03-18 Thread Zhao Yongming (Created) (JIRA)
in some strange situation, cop will crash
-

 Key: TS-1151
 URL: https://issues.apache.org/jira/browse/TS-1151
 Project: Traffic Server
  Issue Type: Bug
Reporter: Zhao Yongming
Assignee: Zhao Yongming


We are seeing some strange crashes in which the manager and cop may die. We
are not sure what the cause is, but I'd like to open an issue here in case
others hit the same problem.

Here is the log from /var/log/messages:

{code}
Mar 19 10:11:06 cache162.cn77 kernel:: [2510081.212455] [ET_NET 3][319]: 
segfault at 2aaae6e986bc ip 003f7f27bdbe sp 40be2188 error 4 in 
libc-2.5.so[3f7f20+14d000]
Mar 19 10:11:09 cache162.cn77 traffic_manager[305]: {0x7fd3a665c720} FATAL: 
[LocalManager::pollMgmtProcessServer] Error in read (errno: 104)
Mar 19 10:11:09 cache162.cn77 traffic_manager[305]: {0x7fd3a665c720} FATAL:  
(last system error 104: Connection reset by peer)
Mar 19 10:11:09 cache162.cn77 traffic_manager[305]: {0x7fd3a665c720} ERROR: 
[LocalManager::sendMgmtMsgToProcesses] Error writing message
Mar 19 10:11:09 cache162.cn77 traffic_manager[305]: {0x7fd3a665c720} ERROR:  
(last system error 32: Broken pipe)
Mar 19 10:11:09 cache162.cn77 traffic_cop[303]: cop received child status 
signal [305 2816]
Mar 19 10:11:09 cache162.cn77 traffic_cop[303]: traffic_manager not running, 
making sure traffic_server is dead
Mar 19 10:11:09 cache162.cn77 traffic_cop[303]: spawning traffic_manager
Mar 19 10:11:16 cache162.cn77 traffic_manager[1227]: NOTE: --- Manager Starting 
---
Mar 19 10:11:16 cache162.cn77 traffic_manager[1227]: NOTE: Manager Version: 
Apache Traffic Server - traffic_manager - 3.0.2 - (build # 299 on Mar  9 2012 
at 09:55:44)
Mar 19 10:11:16 cache162.cn77 traffic_manager[1227]: {0x7f8ae2f48720} STATUS: 
opened /var/log/trafficserver/manager.log
Mar 19 10:11:23 cache162.cn77 traffic_cop[303]: (cli test) unable to retrieve 
manager_binary
Mar 19 10:11:39 cache162.cn77 traffic_server[1260]: NOTE: --- Server Starting 
---
Mar 19 10:11:39 cache162.cn77 traffic_server[1260]: NOTE: Server Version: 
Apache Traffic Server - traffic_server - 3.0.2 - (build # 299 on Mar  9 2012 at 
09:56:00)
Mar 19 10:11:46 cache162.cn77 traffic_server[1260]: {0x2ad4afd3d970} STATUS: 
opened /var/log/trafficserver/diags.log
Mar 19 10:15:06 cache162.cn77 kernel:: [2510320.713808] [ET_NET 3][1277]: 
segfault at 2aab1cfa6a03 ip 003f7f27bdbe sp 4141c188 error 4 in 
libc-2.5.so[3f7f20+14d000]
Mar 19 10:15:06 cache162.cn77 traffic_manager[1227]: {0x7f8ae2f48720} ERROR: 
[LocalManager::pollMgmtProcessServer] Server Process terminated due to Sig 11: 
Segmentation fault
Mar 19 10:15:06 cache162.cn77 traffic_manager[1227]: {0x7f8ae2f48720} ERROR:  
(last system error 2: No such file or directory)
Mar 19 10:15:06 cache162.cn77 traffic_manager[1227]: {0x7f8ae2f48720} ERROR: 
[Alarms::signalAlarm] Server Process was reset
Mar 19 10:15:06 cache162.cn77 traffic_manager[1227]: {0x7f8ae2f48720} ERROR:  
(last system error 2: No such file or directory)
Mar 19 10:15:08 cache162.cn77 traffic_server[2412]: NOTE: --- Server Starting 
---
Mar 19 10:15:08 cache162.cn77 traffic_server[2412]: NOTE: Server Version: 
Apache Traffic Server - traffic_server - 3.0.2 - (build # 299 on Mar  9 2012 at 
09:56:00)
Mar 19 10:15:08 cache162.cn77 traffic_server[2412]: {0x2af4c2ad5970} STATUS: 
opened /var/log/trafficserver/diags.log
Mar 19 10:54:53 cache162.cn77 ops.hdmon.power: [ OK ] Power Unit PSU 1: 
OK;Power Unit PSU 2: OK.
{code}
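
As a reading aid for the kernel lines in the log above, the following sketch
pulls the fields out of a segfault report and decodes the "error 4" value. It
assumes the usual x86 page-fault error-code bits (bit 0: protection violation
vs. page not present, bit 1: write vs. read, bit 2: user vs. kernel mode) and
that the number after the thread name is the thread id. SEGFAULT_RE and
decode_page_fault are names made up for this example.

```python
import re

# Layout of the kernel's segfault report as it appears in the log above.
SEGFAULT_RE = re.compile(
    r"\[(?P<thread>[^\]]+)\]\[(?P<tid>\d+)\]:\s+segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<error>\d+)"
)


def decode_page_fault(error):
    """Decode the low three bits of an x86 page-fault error code."""
    return {
        "cause": "protection violation" if error & 1 else "page not present",
        "access": "write" if error & 2 else "read",
        "mode": "user" if error & 4 else "kernel",
    }


# First kernel line from the log, joined back into a single line.
line = ("Mar 19 10:11:06 cache162.cn77 kernel:: [2510081.212455] "
        "[ET_NET 3][319]: segfault at 2aaae6e986bc ip 003f7f27bdbe "
        "sp 40be2188 error 4 in libc-2.5.so[3f7f20+14d000]")

m = SEGFAULT_RE.search(line)
fault = decode_page_fault(int(m.group("error")))
print(m.group("thread"), m.group("tid"), m.group("ip"), fault)
```

Under those assumptions, error 4 is a user-mode read of an address that is not
mapped. Both segfaults in this log report the same ip value, so they were
probably raised at the same instruction inside libc.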





[jira] [Updated] (TS-1151) in some strange situation, cop will crash

2012-03-18 Thread Zhao Yongming (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/TS-1151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhao Yongming updated TS-1151:
--

Description: 
We are seeing some strange crashes in which the manager and cop may die. We
are not sure what the cause is, but I'd like to open an issue here in case
others hit the same problem.

Here is the log from /var/log/messages:
{code}
Mar 19 10:08:24 cache172.cn77 kernel:: [1553138.961401] [ET_NET 2][17949]: 
segfault at 2aadf1387937 ip 003c5bc7bdbe sp 410f3188 error 4 in 
libc-2.5.so[3c5bc0+14d000]
Mar 19 10:08:27 cache172.cn77 traffic_manager[17935]: {0x7ff0c8d51720} FATAL: 
[LocalManager::pollMgmtProcessServer] Error in read (errno: 104)
Mar 19 10:08:27 cache172.cn77 traffic_manager[17935]: {0x7ff0c8d51720} FATAL:  
(last system error 104: Connection reset by peer)
Mar 19 10:08:27 cache172.cn77 traffic_manager[17935]: {0x7ff0c8d51720} ERROR: 
[LocalManager::sendMgmtMsgToProcesses] Error writing message
Mar 19 10:08:27 cache172.cn77 traffic_manager[17935]: {0x7ff0c8d51720} ERROR:  
(last system error 32: Broken pipe)
Mar 19 10:08:33 cache172.cn77 traffic_cop[17933]: cop received child status 
signal [17935 2816]
Mar 19 10:08:33 cache172.cn77 traffic_cop[17933]: traffic_manager not running, 
making sure traffic_server is dead
Mar 19 10:08:33 cache172.cn77 traffic_cop[17933]: spawning traffic_manager
Mar 19 10:08:40 cache172.cn77 traffic_manager[2760]: NOTE: --- Manager Starting 
---
Mar 19 10:08:40 cache172.cn77 traffic_manager[2760]: NOTE: Manager Version: 
Apache Traffic Server - traffic_manager - 3.0.2 - (build # 299 on Mar  9 2012 
at 09:55:44)
Mar 19 10:08:40 cache172.cn77 traffic_manager[2760]: {0x7fd03d265720} STATUS: 
opened /var/log/trafficserver/manager.log
Mar 19 10:08:46 cache172.cn77 traffic_cop[17933]: (cli test) unable to retrieve 
manager_binary
Mar 19 10:08:54 cache172.cn77 traffic_server[2789]: NOTE: --- Server Starting 
---
Mar 19 10:08:54 cache172.cn77 traffic_server[2789]: NOTE: Server Version: 
Apache Traffic Server - traffic_server - 3.0.2 - (build # 299 on Mar  9 2012 at 
09:56:00)
Mar 19 10:09:00 cache172.cn77 traffic_server[2789]: {0x2b5a8ef03970} STATUS: 
opened /var/log/trafficserver/diags.log
Mar 19 10:14:02 cache172.cn77 kernel:: [1553476.364204] [ET_NET 0][2789]: 
segfault at 2aab1fa99ce3 ip 003c5bc7bdbe sp 7fff39743fa8 error 4 in 
libc-2.5.so[3c5bc0+14d000]
Mar 19 10:14:03 cache172.cn77 traffic_manager[2760]: {0x7fd03d265720} FATAL: 
[LocalManager::pollMgmtProcessServer] Error in read (errno: 104)
Mar 19 10:14:03 cache172.cn77 traffic_manager[2760]: {0x7fd03d265720} FATAL:  
(last system error 104: Connection reset by peer)
Mar 19 10:14:03 cache172.cn77 traffic_manager[2760]: {0x7fd03d265720} ERROR: 
[LocalManager::sendMgmtMsgToProcesses] Error writing message
Mar 19 10:14:03 cache172.cn77 traffic_manager[2760]: {0x7fd03d265720} ERROR:  
(last system error 32: Broken pipe)
{code}
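
The "cop received child status signal [17935 2816]" line can be read with the
standard wait-status macros. This sketch assumes the first number is the
child's pid (it matches traffic_manager[17935]) and the second is the raw
status word from waitpid on Linux. describe_wait_status is a helper made up
for this example.

```python
import os


def describe_wait_status(status):
    """Interpret a raw wait(2) status word the way the W* macros do."""
    if os.WIFEXITED(status):
        return ("exited", os.WEXITSTATUS(status))
    if os.WIFSIGNALED(status):
        return ("signaled", os.WTERMSIG(status))
    return ("other", status)


# Status value taken from the traffic_cop line in the log above.
print(describe_wait_status(2816))
```

If that assumption holds, 2816 (0x0B00) means traffic_manager exited with code
11 rather than being killed by a signal, which would be consistent with it
exiting on its own after the FATAL read error.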

here is the message in traffic.out
{code}
Mar 19 10:11:06 cache162.cn77 kernel:: [2510081.212455] [ET_NET 3][319]: 
segfault at 2aaae6e986bc ip 003f7f27bdbe sp 40be2188 error 4 in 
libc-2.5.so[3f7f20+14d000]
Mar 19 10:11:09 cache162.cn77 traffic_manager[305]: {0x7fd3a665c720} FATAL: 
[LocalManager::pollMgmtProcessServer] Error in read (errno: 104)
Mar 19 10:11:09 cache162.cn77 traffic_manager[305]: {0x7fd3a665c720} FATAL:  
(last system error 104: Connection reset by peer)
Mar 19 10:11:09 cache162.cn77 traffic_manager[305]: {0x7fd3a665c720} ERROR: 
[LocalManager::sendMgmtMsgToProcesses] Error writing message
Mar 19 10:11:09 cache162.cn77 traffic_manager[305]: {0x7fd3a665c720} ERROR:  
(last system error 32: Broken pipe)
Mar 19 10:11:09 cache162.cn77 traffic_cop[303]: cop received child status 
signal [305 2816]
Mar 19 10:11:09 cache162.cn77 traffic_cop[303]: traffic_manager not running, 
making sure traffic_server is dead
Mar 19 10:11:09 cache162.cn77 traffic_cop[303]: spawning traffic_manager
Mar 19 10:11:16 cache162.cn77 traffic_manager[1227]: NOTE: --- Manager Starting 
---
Mar 19 10:11:16 cache162.cn77 traffic_manager[1227]: NOTE: Manager Version: 
Apache Traffic Server - traffic_manager - 3.0.2 - (build # 299 on Mar  9 2012 
at 09:55:44)
Mar 19 10:11:16 cache162.cn77 traffic_manager[1227]: {0x7f8ae2f48720} STATUS: 
opened /var/log/trafficserver/manager.log
Mar 19 10:11:23 cache162.cn77 traffic_cop[303]: (cli test) unable to retrieve 
manager_binary
Mar 19 10:11:39 cache162.cn77 traffic_server[1260]: NOTE: --- Server Starting 
---
Mar 19 10:11:39 cache162.cn77 traffic_server[1260]: NOTE: Server Version: 
Apache Traffic Server - traffic_server - 3.0.2 - (build # 299 on Mar  9 2012 at 
09:56:00)
Mar 19 10:11:46 cache162.cn77 traffic_server[1260]: {0x2ad4afd3d970} STATUS: 
opened /var/log/trafficserver/diags.log
Mar 19 10:15:06 cache162.cn77 kernel:: [2510320.713808] [ET_NET 3][1277]: 
segfault at 2aab1cfa6a03 ip 003f7f27bdbe sp 4141c188 error 4 in