This was being discussed in two separate threads:

Greg,

Are you able to find which process is using a port? Perhaps jvisualvm might help determine which test it is? If you can narrow it down to a bunch of test definitions sharing a test class, by identifying the class on the stack, you'll find it.

Sim wrote:
On 09-04-13 11:44, Peter Firmstone wrote:
I've never experienced the issue locally (I see it on Jenkins quite a
lot), but I suspect a stale registrar process left from another test may
be stopping the socket from closing.  Note that registrars are also
simulated for discovery tests, so it may not necessarily be Reggie.

The code is duplicated in two places in superclasses of the tests that
are failing.  The method portInUse(int port) is supposed to check if the
port is available, but it only selects from a list of LookupLocators
known to have started, so it doesn't actually check that the port is
available.

Perhaps you can find the stale test process responsible?

And other jobs get executed on the same machine at the same time. Maybe these consume ports?

The best way to find the problem looks to me like doing a 'lsof -i' in the test as soon as a port unexpectedly turns out to be in use.

Gr. Simon

--
QCG, Software voor het MKB, 071-5890970, http://www.qcg.nl
Quality Consultancy Group b.v., Leiderdorp, Kvk Den Haag: 28088397

Peter wrote:
It's a problem I've been aware of for some time. Many tests don't specify a port; they just want three or four or however many registrars started, and they have to specify each one using a LookupLocator. Unfortunately you can't specify an ephemeral LookupLocator (it's not allowed), so if you don't specify a particular port you get 4160, and the portInUse method is supposed to work around that.

When a test finds a port in use, it updates the registrar wanted to an available port. If it thinks 4160 is available, when it isn't, boom, test failed.
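
For illustration only, here is a minimal sketch of one alternative way to ask whether a port is really free: try to bind it. This is not the code in LookupServices; the class name BindProbe and its method are hypothetical, and the answer is only a snapshot, since another process can still take the port between the check and the registrar starting.

```java
import java.io.IOException;
import java.net.ServerSocket;

/** Hypothetical helper, not part of the River QA harness. */
final class BindProbe {

    /** Returns true if a listening socket could not be bound to the port. */
    static boolean portInUse(int port) {
        if (port == 0) return false; // 0 asks the OS for an ephemeral port
        try (ServerSocket ss = new ServerSocket(port)) {
            return false; // bind succeeded, so the port was free just now
        } catch (IOException e) {
            // Bind failed, most likely because a socket already holds the port.
            return true;
        }
    }

    public static void main(String[] args) {
        // 4160 is the default port mentioned in the thread.
        System.out.println("4160 in use: " + portInUse(4160));
    }
}
```

A bind-based check sees any socket holding the port on this host, not just registrars the current test started, which is the gap described above.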

Regards,

Peter.

Greg Trasuk wrote:
I hope you're not adding this to try to help my configuration issue,
because I'm pretty sure that's not the problem, and I hate to screw
around with the regression tests unnecessarily.

Evidence-wise, the "lookup started != initial lookups wanted" issue
looks like this:
When I enable logging at the FINE level, I see three registrars started,
with unicast discovery on random ports - this is Reggie's fallback
position when it attempts to open its unicast port and it's already in
use.  However, the test that's checking for "lookup started == initial
lookups wanted" is testing for three separate registrars on the default
port.  Obviously that's an impossible proposition.  My working theory is
that the test is supposed to be configured to start Reggie on three
separate unicast ports (e.g. 7700, 7701, 7702), and the assertion is
checking to see if that happened correctly.  This theory is supported by
the existence of
"qa/src/com/sun/jini/test/share/reggie3_2Ports.properties", which
includes properties called "net.jini.core.lookup.ServiceRegistrar.port.0=7700",
"net.jini.core.lookup.ServiceRegistrar.port.1=7701",
"net.jini.core.lookup.ServiceRegistrar.port.2=7702", etc.  Hence my
suspicion that I don't understand the configuration mechanism well
enough yet.  My theory is that the test is supposed to start up Reggies
on three different ports, but there is something about my test
environment setup that causes the configuration to be missed.  Another
possibility is that the actual test configuration got messed up at some
point (I did see a comment in one of the properties files about "these
entries are commented out because they are now included in the test
descriptions").
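
One way to test that theory would be to dump the ports a given set of properties actually yields. The following is only a sketch: the property names are the ones quoted above, but the class RegistrarPorts is hypothetical, and it assumes the values are reachable as a plain java.util.Properties, which may not be how the multi-layer QA configuration resolves them.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

/** Hypothetical helper, not part of the River QA harness. */
final class RegistrarPorts {

    static final String PREFIX = "net.jini.core.lookup.ServiceRegistrar.port.";

    /** Collects port.0, port.1, ... until the first missing index. */
    static List<Integer> configuredPorts(Properties props) {
        List<Integer> ports = new ArrayList<>();
        for (int i = 0; ; i++) {
            String value = props.getProperty(PREFIX + i);
            if (value == null) break; // no more indexed entries
            ports.add(Integer.parseInt(value.trim()));
        }
        return ports;
    }

    public static void main(String[] args) throws IOException {
        Properties props = new Properties();
        props.load(new StringReader(
                PREFIX + "0=7700\n" + PREFIX + "1=7701\n" + PREFIX + "2=7702\n"));
        System.out.println(configuredPorts(props)); // prints [7700, 7701, 7702]
    }
}
```

An empty list from the properties a test really loads would point at the configuration being missed rather than at the port check.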

Testing that theory is going to take me a little while to understand the
multi-layer test configuration mechanism.  Unfortunately I've got a full
morning of other work ahead of me.

Anyhow, I don't really think changing the tests is the correct approach
at this point (at least in my troubleshooting process).

Cheers,

Greg.



On Tue, 2013-04-09 at 07:05, peter_firmst...@apache.org wrote:
Author: peter_firmstone
Date: Tue Apr  9 11:05:32 2013
New Revision: 1465969

URL: http://svn.apache.org/r1465969
Log:
Be more thorough when checking for port availability on localhost during testing, don't assume port free if current test isn't using it.

Modified:
river/jtsk/skunk/qa_refactor/trunk/qa/src/com/sun/jini/test/share/LookupServices.java

Modified: river/jtsk/skunk/qa_refactor/trunk/qa/src/com/sun/jini/test/share/LookupServices.java
URL: http://svn.apache.org/viewvc/river/jtsk/skunk/qa_refactor/trunk/qa/src/com/sun/jini/test/share/LookupServices.java?rev=1465969&r1=1465968&r2=1465969&view=diff
==============================================================================
--- river/jtsk/skunk/qa_refactor/trunk/qa/src/com/sun/jini/test/share/LookupServices.java (original)
+++ river/jtsk/skunk/qa_refactor/trunk/qa/src/com/sun/jini/test/share/LookupServices.java Tue Apr 9 11:05:32 2013
@@ -27,7 +27,15 @@ import com.sun.jini.test.spec.discoverys
 import com.sun.jini.test.spec.discoveryservice.AbstractBaseTest;
 import com.sun.jini.test.spec.discoveryservice.AbstractBaseTest.DiscoveryStruct;
 import com.sun.jini.test.spec.discoveryservice.AbstractBaseTest.RegGroupsPair;
+import java.io.IOException;
 import java.net.InetAddress;
+import java.net.InetSocketAddress;
+import java.net.ServerSocket;
+import java.net.Socket;
+import java.net.SocketAddress;
+import java.net.SocketException;
+import java.net.SocketOptions;
+import java.net.SocketTimeoutException;
 import java.net.UnknownHostException;
 import java.rmi.RemoteException;
 import java.util.ArrayList;
@@ -660,12 +668,47 @@ public class LookupServices {
      * @return true if port in use.
      */
     private boolean portInUse(int port) {
+        if (port == 0) return false; // Ephemeral
         for(int i=0;i<lookupsStarted.size();i++) {
             LocatorGroupsPair pair = lookupsStarted.get(i);
             int curPort = (pair.getLocator()).getPort();
             if(port == curPort) return true;
         }//end loop
-        return false;
+        // Open a client ephemeral socket and attempt to connect to
+        // port on localhost to see if someone's listening.
+        Socket sock = null;
+        try {
+            sock = new Socket();
+            if (sock instanceof SocketOptions){
+                // Socket terminates with a RST rather than a FIN, so there's no TIME_WAIT
+                try {
+                    ((SocketOptions) sock).setOption(SocketOptions.SO_LINGER, Integer.valueOf(0));
+                } catch (SocketException se) {
+                    // Ignore, not supported.
+                    logger.log( Level.FINEST, "SocketOptions set SO_LINGER threw an Exception", se);
+                }
+            }
+            SocketAddress add = new InetSocketAddress(port);
+            sock.connect(add, 3000); // Try to connect for up to three seconds
+            // We were able to connect to a socket listening on localhost
+            return true;
+        } catch (SocketTimeoutException e){
+            // There might be a stale process assume in use.
+            logger.log( Level.FINEST, "Socket timed out while trying to connect", e);
+            return true;
+        } catch (IOException e){
+            // There was nothing listening on the socket so it's probably free.
+            // or it timed out.
+            return false;
+        } finally {
+            if (sock != null){
+                try {
+                    sock.close();
+                } catch (IOException ex){
+                        logger.log( Level.FINEST, "Socket threw exception while attempting to close", ex);
+                }// Ignore
+            }
+        }
     }//end portInUse
 
     private void refreshLookupLocatorListsAt(int index){
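
The connect-based probe in the diff above can also be sketched as a standalone class, which may be handy for trying it outside the harness. This is an illustration under assumptions, not the committed code: ConnectProbe and somethingListening are hypothetical names, the probe targets the loopback address explicitly, and SO_LINGER is set through Socket.setSoLinger(true, 0), since java.net.Socket does not itself implement SocketOptions and an instanceof guard on it would not be expected to pass.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.net.SocketException;
import java.net.SocketTimeoutException;

/** Hypothetical standalone variant of the connect-based check. */
final class ConnectProbe {

    /** Returns true if something accepts (or stalls) a connection on localhost:port. */
    static boolean somethingListening(int port, int timeoutMillis) {
        if (port == 0) return false; // ephemeral, nothing to probe
        try (Socket sock = new Socket()) {
            try {
                // Linger of zero makes close() send RST, so no TIME_WAIT is left behind.
                sock.setSoLinger(true, 0);
            } catch (SocketException ignored) {
                // Option not supported; the probe still works without it.
            }
            sock.connect(new InetSocketAddress(InetAddress.getLoopbackAddress(), port),
                    timeoutMillis);
            return true; // connected, so someone is listening
        } catch (SocketTimeoutException e) {
            return true; // possibly a stale process; assume in use
        } catch (IOException e) {
            return false; // connection refused or similar; probably free
        }
    }

    public static void main(String[] args) {
        System.out.println("4160 listening: " + somethingListening(4160, 3000));
    }
}
```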




