[jira] [Commented] (LUCENE-5786) Unflushed/ truncated events file (hung testing subprocess)

2014-07-01 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14048792#comment-14048792
 ] 

ASF subversion and git services commented on LUCENE-5786:
-

Commit 1607060 from [~dawidweiss] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1607060 ]

LUCENE-5786: Unflushed/ truncated events file (hung testing subprocess). 
Updating RR to 2.1.6

> Unflushed/ truncated events file (hung testing subprocess)
> --
>
> Key: LUCENE-5786
> URL: https://issues.apache.org/jira/browse/LUCENE-5786
> Project: Lucene - Core
> Issue Type: Bug
> Components: general/test
> Reporter: Dawid Weiss
> Assignee: Dawid Weiss
> Fix For: 5.0, 4.10
>
>
> This has happened several times on Jenkins, typically on
> SSLMigrationTest.testDistribSearch, but probably on other tests as well.
> The symptom is: the test framework never terminates, and it also reports an
> incorrect (?) hung test.
> The problem is that the actual forked JVM is hung reading stdin, waiting
> for the next test suite (no test thread is present); the master process is
> hung receiving data from the forked JVM (both the events file and the stdout
> spill are truncated in the middle of a test). The last output is:
> {code}
> [
>   "APPEND_STDERR",
>   {
>     "chunk": "612639 T30203 oasu.DefaultSolrCoreState.doRecovery Running recovery - first canceling any ongoing recovery%0A"
>   }
> ]
> [
>   "APPEND_STDERR"
> {code}
> Overall, it looks insane -- there are flushes after each test completes
> (normally or not), and there are tests *following* the one that last reported
> output and before dynamic suites on stdin.
> I have no idea. The best explanation is insane -- it looks like the test thread
> just died in the middle of executing Java code...



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5786) Unflushed/ truncated events file (hung testing subprocess)

2014-07-01 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14048790#comment-14048790
 ] 

ASF subversion and git services commented on LUCENE-5786:
-

Commit 1607058 from [~dawidweiss] in branch 'dev/trunk'
[ https://svn.apache.org/r1607058 ]

LUCENE-5786: Unflushed/ truncated events file (hung testing subprocess). 
Updating RR to 2.1.6




[jira] [Commented] (LUCENE-5786) Unflushed/ truncated events file (hung testing subprocess)

2014-06-30 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14047452#comment-14047452
 ] 

Uwe Schindler commented on LUCENE-5786:
---

I updated the OpenJDK port on lucene-zones.apache.org.




[jira] [Commented] (LUCENE-5786) Unflushed/ truncated events file (hung testing subprocess)

2014-06-29 Thread Dawid Weiss (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14047437#comment-14047437
 ] 

Dawid Weiss commented on LUCENE-5786:
-

The problem is caused by a stack overflow in Solr code (SOLR-6213), which in 
turn breaks the forked JVM's communication layer with the master test 
controller. The socket interrupt issue is probably triggering the bug.

Unfixable at the test framework layer, I think. I'll try to improve logging 
and make the forked JVM at least quit in such a scenario.
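
One way a forked JVM could be made to quit rather than linger after such a failure is a default uncaught-exception handler that logs and then halts on fatal VM errors. The following is only a sketch under that assumption; it is not the change that went into the test framework, and the class name, the `shouldHalt` helper and the exit status are hypothetical.

```java
/** Hypothetical sketch: halt a forked JVM when a thread dies from a fatal VM error. */
public class FailFastSketch {
  /** Hypothetical exit status; any nonzero value the parent process recognizes would do. */
  static final int HALT_STATUS = 99;

  /** StackOverflowError and OutOfMemoryError are both VirtualMachineError subclasses. */
  static boolean shouldHalt(Throwable t) {
    return t instanceof VirtualMachineError;
  }

  /** Installs a JVM-wide handler for exceptions that no thread or thread group handled. */
  static void install() {
    Thread.setDefaultUncaughtExceptionHandler((thread, t) -> {
      // Log first, so the parent process has something to report.
      System.err.println("Uncaught in " + thread.getName() + ": " + t);
      System.err.flush();
      if (shouldHalt(t)) {
        // halt() skips shutdown hooks, which may themselves hang in a broken JVM.
        Runtime.getRuntime().halt(HALT_STATUS);
      }
    });
  }
}
```

The handler only fires for throwables that escape a thread entirely, so code that catches `Throwable` further up the stack would still hide the error.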





[jira] [Commented] (LUCENE-5786) Unflushed/ truncated events file (hung testing subprocess)

2014-06-28 Thread Dawid Weiss (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14046984#comment-14046984
 ] 

Dawid Weiss commented on LUCENE-5786:
-

I will re-enable all test plans I disabled once I'm done testing. I wanted a 
faster build cycle. It shouldn't take too long (if I don't find it by Monday 
I'll probably give up).




[jira] [Commented] (LUCENE-5786) Unflushed/ truncated events file (hung testing subprocess)

2014-06-28 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14046970#comment-14046970
 ] 

Uwe Schindler commented on LUCENE-5786:
---

Would it be possible to keep the artifacts tasks alive? Those do not run tests, 
but they provide (Maven-) artifacts to our users.




[jira] [Commented] (LUCENE-5786) Unflushed/ truncated events file (hung testing subprocess)

2014-06-28 Thread Dawid Weiss (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14046959#comment-14046959
 ] 

Dawid Weiss commented on LUCENE-5786:
-

I've temporarily disabled ALL active configurations on the FreeBSD Jenkins. 
That is, the following:
 - Lucene-Solr-NightlyTests-trunk
 - Lucene-Artifacts-trunk
 - Lucene-Solr-Clover-4.x
 - Lucene-Solr-Clover-trunk
 - Lucene-Solr-Maven-4.x
 - Lucene-Solr-Maven-trunk
 - Lucene-Solr-NightlyTests-4.x
 - Lucene-Artifacts-4.x
 - Lucene-Solr-SmokeRelease-4.x
 - Lucene-Solr-SmokeRelease-trunk
 - Lucene-Solr-Tests-4.x-Java7
 - Lucene-Solr-Tests-trunk-Java7
 - Solr-Artifacts-4.x
 - Solr-Artifacts-trunk

Most of these haven't had a successful run in what seems like months, so I 
don't think it's a big problem. Still digging into what's causing the death of 
the test thread (apart from the socket interrupt issue we know about, there's 
still something odd about those tests on FreeBSD).





[jira] [Commented] (LUCENE-5786) Unflushed/ truncated events file (hung testing subprocess)

2014-06-27 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14045778#comment-14045778
 ] 

ASF subversion and git services commented on LUCENE-5786:
-

Commit 1606005 from [~dawidweiss] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1606005 ]

LUCENE-5786: adding more debugging to the test framework.




[jira] [Commented] (LUCENE-5786) Unflushed/ truncated events file (hung testing subprocess)

2014-06-27 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14045775#comment-14045775
 ] 

ASF subversion and git services commented on LUCENE-5786:
-

Commit 1606002 from [~dawidweiss] in branch 'dev/trunk'
[ https://svn.apache.org/r1606002 ]

LUCENE-5786: adding more debugging to the test framework.




[jira] [Commented] (LUCENE-5786) Unflushed/ truncated events file (hung testing subprocess)

2014-06-27 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14045776#comment-14045776
 ] 

ASF subversion and git services commented on LUCENE-5786:
-

Commit 1606003 from [~dawidweiss] in branch 'dev/trunk'
[ https://svn.apache.org/r1606003 ]

LUCENE-5786: adding more debugging to the test framework.




[jira] [Commented] (LUCENE-5786) Unflushed/ truncated events file (hung testing subprocess)

2014-06-25 Thread Dawid Weiss (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14044051#comment-14044051
 ] 

Dawid Weiss commented on LUCENE-5786:
-

I think the problem underlying that weird behavior may really be related to 
terminating methods inside native code and their unpredictable behavior on 
FreeBSD -- see SOLR-6204.




[jira] [Commented] (LUCENE-5786) Unflushed/ truncated events file (hung testing subprocess)

2014-06-25 Thread Dawid Weiss (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043183#comment-14043183
 ] 

Dawid Weiss commented on LUCENE-5786:
-

Switching to FileOutputStream didn't change anything -- the next build still 
hung. I'm running out of ideas. The event file is truncated at an impossible 
place, and the test thread is gone.
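
For reference, a truncated tail like the one quoted in the issue description (an opening `[` and `"APPEND_STDERR"` with nothing after it) can be spotted mechanically by tracking bracket depth outside string literals. This is a minimal sketch of that check, not the framework's actual event reader; the class and method names are hypothetical.

```java
/** Hypothetical sketch: detect an events stream that ends mid-event. */
public class TruncationCheck {
  /**
   * Returns true if the text ends inside an unterminated string or with
   * unclosed brackets or braces, i.e. the last event was cut off.
   */
  static boolean isTruncated(String events) {
    int depth = 0;
    boolean inString = false;
    boolean escaped = false;
    for (int i = 0; i < events.length(); i++) {
      char c = events.charAt(i);
      if (inString) {
        if (escaped) {
          escaped = false;          // character after a backslash is literal
        } else if (c == '\\') {
          escaped = true;
        } else if (c == '"') {
          inString = false;
        }
      } else if (c == '"') {
        inString = true;
      } else if (c == '[' || c == '{') {
        depth++;
      } else if (c == ']' || c == '}') {
        depth--;
      }
    }
    return inString || depth != 0;
  }
}
```

A check like this only says that the stream stops mid-event; it does not say why the writer stopped.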




[jira] [Commented] (LUCENE-5786) Unflushed/ truncated events file (hung testing subprocess)

2014-06-24 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14041836#comment-14041836
 ] 

ASF subversion and git services commented on LUCENE-5786:
-

Commit 1605025 from [~dawidweiss] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1605025 ]

LUCENE-5786: Unflushed/ truncated events file (hung testing subprocess). Point 
at Sonatype's https repo directly.




[jira] [Commented] (LUCENE-5786) Unflushed/ truncated events file (hung testing subprocess)

2014-06-24 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14041832#comment-14041832
 ] 

ASF subversion and git services commented on LUCENE-5786:
-

Commit 1605024 from [~dawidweiss] in branch 'dev/trunk'
[ https://svn.apache.org/r1605024 ]

LUCENE-5786: Unflushed/ truncated events file (hung testing subprocess)




[jira] [Commented] (LUCENE-5786) Unflushed/ truncated events file (hung testing subprocess)

2014-06-24 Thread Dawid Weiss (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14041827#comment-14041827
 ] 

Dawid Weiss commented on LUCENE-5786:
-

I reverted to a simple output stream instead of using RAF. Let's see how it 
goes. Will update to RR 2.1.4 in a second.

https://github.com/carrotsearch/randomizedtesting/issues/170




[jira] [Commented] (LUCENE-5786) Unflushed/ truncated events file (hung testing subprocess)

2014-06-23 Thread Dawid Weiss (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14040655#comment-14040655
 ] 

Dawid Weiss commented on LUCENE-5786:
-

I think this may actually be a bug in RR code. Here's why -- writing is done 
via RandomAccessFile, and its flush() method used to force the channel to disk, 
but at some point that call was commented out:
{code}
   @Override
   public void flush() throws IOException {
-    raf.getChannel().force(true);
+    // This was causing intermittent channel invalidations on Windows for
+    // no apparent reason. Also, it shouldn't be a problem if we don't sync
+    // with the disk (and use OS cache only)?
+    // raf.getChannel().force(true);
   }
{code}

So it either randomly fails on Windows or it will not sync properly on FreeBSD. 
Nice. I'll revert to a simple FileOutputStream and see if this improves things.
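
To make the distinction concrete: writing through a plain FileOutputStream hands the bytes to the OS, while `FileChannel.force(true)` additionally asks the OS to push them to the storage device. The sketch below illustrates the two steps with standard JDK calls only; it is not the RR implementation, and the class, method and parameter names are hypothetical.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

/** Hypothetical sketch of appending events with and without a disk sync. */
public class EventsFileSketch {
  /** Appends one event; optionally forces it through the OS cache to disk. */
  static void writeEvent(FileOutputStream out, String event, boolean syncToDisk)
      throws IOException {
    // An unbuffered FileOutputStream passes the bytes to the OS on write().
    out.write(event.getBytes(StandardCharsets.UTF_8));
    // No-op for a raw FileOutputStream; it matters only if a buffer is added on top.
    out.flush();
    if (syncToDisk) {
      // Forces content and metadata to the storage device.
      out.getChannel().force(true);
    }
  }

  /** Writes an event without syncing, then reads it back while the writer is still open. */
  static String roundTrip(String event) {
    try {
      File f = File.createTempFile("events", ".json");
      try (FileOutputStream out = new FileOutputStream(f, true)) {
        writeEvent(out, event, false);
        return new String(Files.readAllBytes(f.toPath()), StandardCharsets.UTF_8);
      } finally {
        f.delete();
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

A reader on the same machine normally sees unsynced data through the OS cache, so skipping `force(true)` should mainly matter after a crash or power loss rather than for a live reader.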
