Shelley Lynn Hughes-Godfrey created GEODE-6200:
--------------------------------------------------

             Summary: CI: netstat --with-lsof fails with OOME (when netstat command not found)
                 Key: GEODE-6200
                 URL: https://issues.apache.org/jira/browse/GEODE-6200
             Project: Geode
          Issue Type: Bug
          Components: gfsh
            Reporter: Shelley Lynn Hughes-Godfrey


org.apache.geode.management.internal.cli.NetstatDUnitTest > testOutputToConsoleWithLsofForOneMember FAILED

{noformat}
java.lang.OutOfMemoryError: Java heap space
Dumping heap to java_pid1.hprof ...

org.apache.geode.management.internal.cli.NetstatDUnitTest > testOutputToConsoleWithLsofForOneMember FAILED
    java.lang.OutOfMemoryError: Java heap space
        at java.util.Arrays.copyOf(Arrays.java:3332)
        at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:124)
        at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:649)
        at java.lang.StringBuilder.append(StringBuilder.java:202)
        at org.json.JSONStringer.string(JSONStringer.java:369)
        at org.json.JSONStringer.value(JSONStringer.java:262)
        at org.json.JSONArray.writeTo(JSONArray.java:732)
        at org.json.JSONStringer.value(JSONStringer.java:231)
        at org.json.JSONObject.writeTo(JSONObject.java:882)
        at org.json.JSONStringer.value(JSONStringer.java:235)
        at org.json.JSONObject.writeTo(JSONObject.java:882)
        at org.json.JSONObject.toString(JSONObject.java:849)
        at org.apache.geode.management.internal.cli.json.GfJsonObject.toString(GfJsonObject.java:301)
        at java.lang.String.valueOf(String.java:2994)
        at java.lang.StringBuilder.append(StringBuilder.java:131)
        at org.apache.geode.management.internal.cli.result.CommandResult.toString(CommandResult.java:508)
        at org.apache.geode.management.internal.cli.NetstatDUnitTest.testOutputToConsoleWithLsofForOneMember(NetstatDUnitTest.java:104)
{noformat}

=-=-=-=-=-=-=-=-=-=-=-=-=-=  Test Results Website =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
http://s3proxy.gemfire.pivotal.io/gemfire-test-results/9.5/distributedTest/1544666867/index.html
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

To download the test artifacts from this job, execute the following command 
after the job has completed:

     aws s3 cp s3://gemfire-build-artifacts/9.5/9.5.3-build.2/1544666867/distributedtestfiles-9.5.3-build.2.tgz .

This failure initially looks like GEODE-2488, which was fixed in March 2018.
The fix for GEODE-2488 marked the --with-lsof tests as @Ignore (tagged with
this bug). Later, the commit below added a new test
(testOutputToConsoleWithLsofForOneMember), so once again we run
netstat --with-lsof, which produces a huge amount of output. All of it is read
into a single buffer for parsing, which leads to the OOME. I don't think this
output comes from a successful execution of the netstat command, though: the
test output shows that the netstat command is not found (see below).

{noformat}
commit d2b263f9053f293a409c527d9c8b5ae17b745041
Author: Jens Deppe <jde...@pivotal.io>
Date:   Fri Jun 22 15:33:20 2018 -0700

    GEODE-5335: Do not resolve addresses when calling netstat and lsof (#2070)

    - This avoids long command pauses (or failures) if DNS is slow or
      misconfigured.
    - Add more netstat tests
   
    (cherry picked from commit 908a5efe59c4a81be647bb82ba58a4ccba98e1ac)
{noformat}

{noformat}
+  public void testOutputToConsoleWithLsofForOneMember() throws Exception {
+    CommandResult result = gfsh.executeCommand("netstat --member=server-1 --with-lsof");
+    assertThat(result.getStatus()).isEqualTo(Result.Status.OK);
+
+    String rawOutput = result.getMessageFromContent();
+    String[] lines = rawOutput.split("\n");
+
+    assertThat(lines.length).isGreaterThan(5);
+    assertThat(lines[4].trim().split("[,\\s]+")).containsExactlyInAnyOrder("server-1");
+    assertThat(lines).filteredOn(e -> e.contains("## lsof output ##")).hasSize(1);
+  }
{noformat}

Interestingly, it looks like netstat fails here (from test output):
{noformat}
Command result for <netstat --member=server-1 --with-lsof>: 
##########################################################
Host: ebc7313d51a3
OS: Linux 4.15.0-38-generic amd64
Member(s):
 server-1
##########################################################
Could not execute "netstat". Reason: Cannot run program "netstat": error=2, No such file or directory
{noformat}

The rest of the output is a huge lsof listing that starts like this:
{noformat}
################ lsof output ###################
COMMAND PID TID USER   FD      TYPE             DEVICE SIZE/OFF     NODE NAME
java      1     root  cwd       DIR               0,59       44   280305 /tmp/build/ae3c03f4/built-gemfire/test/geode/geode-core/build/distributedTest1562
java      1     root  rtd       DIR              0,102       80   234603 /
java      1     root  txt       REG              0,102     8464   161745 /usr/lib/jvm/java-8-oracle/jre/bin/java
java      1     root  mem       REG               0,67            161745 /usr/lib/jvm/java-8-oracle/jre/bin/java (path dev=0,102)
java      1     root  mem       REG               0,67            162079 /usr/lib/jvm/java-8-oracle/jre/lib/resources.jar (path dev=0,102)
java      1     root  mem       REG               0,67            161955 /usr/lib/jvm/java-8-oracle/jre/lib/ext/cldrdata.jar (path dev=0,102)
java      1     root  mem       REG               0,67            161959 /usr/lib/jvm/java-8-oracle/jre/lib/ext/localedata.jar (path dev=0,102)
java      1     root  mem       REG               0,67            161961 /usr/lib/jvm/java-8-oracle/jre/lib/ext/nashorn.jar (path dev=0,102)
java      1     root  mem       REG               0,67            161810 /usr/lib/jvm/java-8-oracle/jre/lib/amd64/libmanagement.so (path dev=0,102)
java      1     root  mem       REG               0,67            142155 /lib/x86_64-linux-gnu/libgcc_s.so.1 (path dev=0,102)
java      1     root  mem       REG               0,67            169374 /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.21 (path dev=0,102)
java      1     root  mem       REG               0,44             13472 /tmp/build/ae3c03f4/cache/gradle/native/25/linux-amd64/libnative-platform.so (path dev=0,58)
{noformat}
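
One way to keep a listing of this size from exhausting the heap is to cap how
much command output is retained. The following is only a minimal sketch under
stated assumptions, not Geode's actual code: the class BoundedOutput, the
method readCapped, and the "[output truncated]" marker are hypothetical names
introduced for illustration.

```java
import java.io.IOException;
import java.io.Reader;

// Hypothetical helper (not part of Geode): keeps at most maxChars characters
// of a command's output and discards the remainder.
class BoundedOutput {

  static String readCapped(Reader reader, int maxChars) throws IOException {
    StringBuilder kept = new StringBuilder();
    char[] buffer = new char[4096];
    boolean truncated = false;
    int n;
    // Keep reading to the end so the producing process is never blocked on a
    // full pipe, but only append while there is room under the cap.
    while ((n = reader.read(buffer)) != -1) {
      int room = maxChars - kept.length();
      if (room > 0) {
        kept.append(buffer, 0, Math.min(n, room));
      }
      if (n > room) {
        truncated = true;
      }
    }
    if (truncated) {
      kept.append("\n[output truncated]");
    }
    return kept.toString();
  }
}
```

For a child process, the reader could be an InputStreamReader wrapped around
Process.getInputStream(). Whether truncating lsof output is acceptable for
the netstat command is a separate design question.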

Note that this is not new: we saw this 56 days ago (9.5.2 build 10):

http://concourse.gemfire.pivotal.io/teams/main/pipelines/gemfire-9.5/jobs/DistributedTest/builds/59
{noformat}
java.lang.OutOfMemoryError: Java heap space
Dumping heap to java_pid1.hprof ...

org.apache.geode.management.internal.cli.NetstatDUnitTest > testOutputToConsoleWithLsofForOneMember FAILED
    java.lang.OutOfMemoryError: Java heap space
        at java.util.Arrays.copyOf(Arrays.java:3332)
        at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:124)
        at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:649)
        at java.lang.StringBuilder.append(StringBuilder.java:202)
        at java.util.AbstractCollection.toString(AbstractCollection.java:464)
        at java.util.Vector.toString(Vector.java:1003)
        at java.lang.String.valueOf(String.java:2994)
        at java.lang.StringBuilder.append(StringBuilder.java:131)
        at org.apache.geode.management.internal.cli.result.CommandResult.toString(CommandResult.java:508)
        at org.apache.geode.management.internal.cli.NetstatDUnitTest.testOutputToConsoleWithLsofForOneMember(NetstatDUnitTest.java:104)
Heap dump file created [442340167 bytes in 1.462 secs]
{noformat}

The logs from that run show:
{noformat}
Command result for <netstat --with-lsof=true --file=/tmp/junit1796957499625851049/junit2425143231094040391/command.log.txt>: 

Saved netstat output in the file /tmp/junit1796957499625851049/junit2425143231094040391/command.log.txt.


Command result for <netstat>: 
########################################################
Host: 9aebab1d2525
OS: Linux 4.4.0-89-generic amd64
Member(s):
 server-1, locator-0, server-2
########################################################
Could not execute "netstat". Reason: Cannot run program "netstat": error=2, No such file or directory
{noformat}

If netstat isn't found ... are these tests even doing what they are supposed to?
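
One possible way to make the missing-binary case explicit is to check whether
the executable is on the PATH before the test relies on it. The sketch below
is an assumption-laden illustration, not code from Geode: CommandLookup and
isOnPath are hypothetical names, and it only inspects the PATH environment
variable of the JVM it runs in.

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.InvalidPathException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.regex.Pattern;

// Hypothetical helper (not part of Geode): reports whether a command name
// resolves to an executable regular file in one of the PATH directories.
class CommandLookup {

  static boolean isOnPath(String command) {
    return isOnPath(command, System.getenv("PATH"));
  }

  static boolean isOnPath(String command, String pathValue) {
    if (pathValue == null || pathValue.isEmpty()) {
      return false;
    }
    for (String dir : pathValue.split(Pattern.quote(File.pathSeparator))) {
      if (dir.isEmpty()) {
        continue;
      }
      try {
        Path candidate = Paths.get(dir, command);
        if (Files.isRegularFile(candidate) && Files.isExecutable(candidate)) {
          return true;
        }
      } catch (InvalidPathException e) {
        // Skip PATH entries that are not valid paths on this platform.
      }
    }
    return false;
  }
}
```

In a JUnit 4 test, something like
Assume.assumeTrue(CommandLookup.isOnPath("netstat")) would skip the test
rather than let it pass or fail on the "Could not execute" output. Note that
in a distributed test the command runs in the member's JVM, so a check in the
test JVM is only meaningful if both run on the same host image.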

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)