I am currently developing a program that consists of several Web services. The program is started by one managing Java application. The Web services then call each other and repeatedly process the data they receive until the whole job is finished. The data exchanged between the Web services consists of simple types and complex types, the latter serialized and deserialized by BeanSerializer. The Web services run normally for a while, but soon they stop receiving data. I would like to solve this problem. Could you give me some advice?
The environment is as follows.

Fedora Core 2, Red Hat 9
Java 2 SDK 1.4.2_06
JAVA_OPTS=' -server -Xmx512m '

Combinations of Tomcat and Apache Axis:
Axis 1.1 and Tomcat 4.1.31
Axis 1.2rc2 and Tomcat 4.1.31
Axis 1.1 and Tomcat 5.5.4
Axis 1.2rc2 and Tomcat 5.5.4

The program fails in all of the environments above.

Here is what I have checked so far.

(1) First, I suspected that the Web services were deadlocked, so I analyzed a thread dump. But I could not find any deadlock.

(2) I ran netstat on the PC where the Web services are running. I found that the number of sockets in TIME_WAIT was increasing over a short period of time. I thought this was why the Web services stopped receiving data, so I changed the program to communicate less frequently in order to keep the number of TIME_WAIT sockets down. As a result, the number of TIME_WAIT sockets decreased. But the problem was not solved.

(3) I thought that the problem was similar to the bug "Connections not closed in http client code." listed at http://issues.apache.org/bugzilla/buglist.cgi?bug_status=NEW&bug_status=ASSIGNED&bug_status=REOPENED&product=Axis . According to http://marc.theaimsgroup.com/?l=axis-dev&m=108650171131081&w=2 , this bug has been fixed, so I assumed that Axis 1.2rc2 contains the fix. I ran the program on Axis 1.2rc2. But the problem was not solved.

(4) After the Web services stopped receiving data, I ran tcpdump on the PC where the Web services are running and watched port 8080, which is used by Axis (Tomcat). The result is shown below.

115550:116998(1448) ack 1 win 5840 <nop,nop,timestamp 2054696 2070801> (DF)
. ack 116998 win 31856 <nop,nop,timestamp 2070802 2054696> (DF)
P 116998:118446(1448) ack 1 win 5840 <nop,nop,timestamp 2054696 2070801> (DF)
. 118446:119894(1448) ack 1 win 5840 <nop,nop,timestamp 2054696 2070801> (DF)
. ack 119894 win 31856 <nop,nop,timestamp 2070803 2054696> (DF)
. 119894:121342(1448) ack 1 win 5840 <nop,nop,timestamp 2054696 2070801> (DF)
. 121342:122790(1448) ack 1 win 5840 <nop,nop,timestamp 2054696 2070801> (DF)
. ack 122790 win 31856 <nop,nop,timestamp 2070803 2054696> (DF)
: . 122790:124238(1448) ack 1 win 5840 <nop,nop,timestamp 2054696 2070801> (DF)
: . 124238:125686(1448) ack 1 win 5840 <nop,nop,timestamp 2054696 2070801> (DF)
: . ack 125686 win 31856 <nop,nop,timestamp 2070803 2054696> (DF)
(The rest is omitted.)

From this result, it seems that although the PC is receiving data, Tomcat or Axis is not receiving it.

(5) I suspected that too much data was being exchanged between the Web services, so I reduced the amount of data exchanged. But the problem was not solved.

(6) Related to (5), I tried sending and receiving the data with GZIP encoding on the HTTP transport. But the problem was not solved.

(7) I suspected the program's timeout setting, so I tried timeout(0), timeout(6000), etc. But the problem was not solved.

Here are some further details of the problem.

(8) Even when the program seems to have stopped, after waiting for a while it runs a little more, then stops again. In the end it stops completely.

(9) At the moment I develop while logged in via ssh and keeping an eye on Tomcat. After the program had completely stopped, I tried to stop Tomcat. On some PCs, Tomcat cannot be stopped unless I log in again and run kill -9 (tomcat PID).

(10) No errors (OutOfMemoryError, etc.) were thrown by the Web services.

Thanks,
Hiroshi Takahashi
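
For reference, the TIME_WAIT check described in (2) could be automated roughly as follows. This is only a minimal sketch, not part of the actual program: the class name TimeWaitCounter, the method countState, and the sample addresses are hypothetical, and it assumes Linux "netstat -tan" style rows (Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State). It also uses generics and the for-each loop, so it needs a JDK newer than the 1.4.2 listed above.

```java
import java.util.Arrays;
import java.util.List;

public class TimeWaitCounter {

    /**
     * Counts the TCP rows of netstat-style output whose State column
     * equals the given state (for example "TIME_WAIT").
     */
    static int countState(List<String> netstatLines, String state) {
        int count = 0;
        for (String line : netstatLines) {
            String[] cols = line.trim().split("\\s+");
            // Assumed row layout: Proto Recv-Q Send-Q Local Foreign State.
            // Header rows are skipped because their first column is not "tcp...".
            if (cols.length >= 6 && cols[0].startsWith("tcp") && cols[5].equals(state)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Hypothetical sample rows; in practice these would be captured
        // from "netstat -tan" at regular intervals.
        List<String> sample = Arrays.asList(
            "Proto Recv-Q Send-Q Local Address Foreign Address State",
            "tcp 0 0 0.0.0.0:8080 0.0.0.0:* LISTEN",
            "tcp 0 0 192.168.0.10:8080 192.168.0.11:40312 TIME_WAIT",
            "tcp 0 0 192.168.0.10:8080 192.168.0.11:40313 TIME_WAIT",
            "tcp 0 0 192.168.0.10:8080 192.168.0.12:51200 ESTABLISHED");
        System.out.println("TIME_WAIT sockets: " + countState(sample, "TIME_WAIT"));
    }
}
```

Logging this count together with a timestamp would show whether the point where the Web services stop receiving data coincides with a rise in TIME_WAIT sockets or, as observed in (2), is unrelated to it.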