[ https://issues.apache.org/jira/browse/MAHOUT-1047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Isabel Drost updated MAHOUT-1047:
---------------------------------
    Attachment: MAHOUT-1047-Show-Leak.patch

Attached a hack^Wpatch that might help reproduce the issue: com.carrotsearch.randomizedtesting-runner tracks threads spawned during unit tests. I added RandomizedTest as the parent class of our Mahout Test class, marked it so that it does not complain about leaking threads (there are a few test classes that do not join their threads), and enabled thread leakage tracking only for TestCVBModelTrainer.

The hack in the patch is that it adds the randomizedtesting-runner dependency in compile scope instead of test scope in mahout/math.

To use the patch:

1. Make sure you have compiled Mahout trunk at least once with "mvn install".
2. Apply the patch with -p1.
3. From the project root, build at least mahout/math with "mvn --projects math install".
4. Execute the tweaked test with "mvn --projects core -Dtest=TestCVBModelTrainer test".

You should see complaints like the following in the output:

Tests in error:
  » ThreadLeak 32 threads leaked from TEST scope at testInMemoryCVB0(org.apache...
  » ThreadLeak There are still zombie threads that couldn't be terminated: 1...
  » ThreadLeak 32 threads leaked from SUITE scope at org.apache.mahout.clusteri...
  » ThreadLeak There are still zombie threads that couldn't be terminated: 1...

> CVB hangs after completion
> --------------------------
>
>                 Key: MAHOUT-1047
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1047
>             Project: Mahout
>          Issue Type: Bug
>          Components: Clustering
>   Affects Versions: 0.7
>         Environment: Ubuntu
>            Reporter: seth boyles
>            Priority: Minor
>              Labels: cvb, lda
>             Fix For: 0.7, 0.8
>
>         Attachments: MAHOUT-1047-Show-Leak.patch
>
>
> After running the new LDA CVB implementation, it hangs and does not terminate
> the process like every other time I run Mahout
> Terminal output:
> 12/07/19 11:38:49 INFO mapred.LocalJobRunner:
> 12/07/19 11:38:49 INFO mapred.Task: Task 'attempt_local_0022_m_000000_0' done.
> 12/07/19 11:38:49 INFO mapred.JobClient: map 100% reduce 0%
> 12/07/19 11:38:49 INFO mapred.JobClient: Job complete: job_local_0022
> 12/07/19 11:38:49 INFO mapred.JobClient: Counters: 8
> 12/07/19 11:38:49 INFO mapred.JobClient: File Output Format Counters
> 12/07/19 11:38:49 INFO mapred.JobClient: Bytes Written=2247793
> 12/07/19 11:38:49 INFO mapred.JobClient: File Input Format Counters
> 12/07/19 11:38:49 INFO mapred.JobClient: Bytes Read=1920337
> 12/07/19 11:38:49 INFO mapred.JobClient: FileSystemCounters
> 12/07/19 11:38:49 INFO mapred.JobClient: FILE_BYTES_READ=1342812616
> 12/07/19 11:38:49 INFO mapred.JobClient: FILE_BYTES_WRITTEN=1326092302
> 12/07/19 11:38:49 INFO mapred.JobClient: Map-Reduce Framework
> 12/07/19 11:38:49 INFO mapred.JobClient: Map input records=2772
> 12/07/19 11:38:49 INFO mapred.JobClient: Spilled Records=0
> 12/07/19 11:38:49 INFO mapred.JobClient: SPLIT_RAW_BYTES=140
> 12/07/19 11:38:49 INFO mapred.JobClient: Map output records=2772
> 12/07/19 11:38:49 INFO driver.MahoutDriver: Program took 4089950 ms (Minutes:
> 68.16583333333334)
> $MAHOUT_HOME/mahout cvb -i
> /home/seth/Scripted/mahout_data/vectors/vectors/vectors-for-cvb/ -o
> /home/seth/Scripted/mahout_data/clusters/ -ow -k 90 -dt
> /home/seth/Scripted/mahout_data/distributions -dict
> /home/seth/Scripted/mahout_data/vectors/vectors/dictionary.file-0 -mt
> /home/seth/Scripted/mahout_data/temp/ -x 20 -cd 0.05
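
For readers unfamiliar with what a "ThreadLeak" report points at, here is a minimal Java sketch of one common way threads can outlive the work they were started for: a thread pool of non-daemon workers that is never shut down. This is only an illustration under that assumption, not a diagnosis of the CVB code; the class name ThreadLeakSketch, the method leakedWorkers, and the thread name prefix are all made up for this example. The sketch counts how many worker threads are still alive after all tasks have finished, with and without an explicit shutdown, and then cleans up so that the example itself terminates.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration; none of these names come from Mahout or the patch.
final class ThreadLeakSketch {

    /**
     * Runs one trivial task per worker on a fixed-size pool and returns how many
     * worker threads are still alive once all tasks have completed.
     */
    static int leakedWorkers(int threads, boolean shutDown) {
        List<Thread> workers = new CopyOnWriteArrayList<>();
        ExecutorService pool = Executors.newFixedThreadPool(threads, r -> {
            Thread t = new Thread(r, "sketch-worker-" + workers.size());
            t.setDaemon(false); // non-daemon threads keep the JVM from exiting
            workers.add(t);
            return t;
        });
        try {
            CountDownLatch done = new CountDownLatch(threads);
            for (int i = 0; i < threads; i++) {
                pool.execute(done::countDown);
            }
            done.await(); // every task has finished at this point

            if (shutDown) {
                // The well-behaved path: stop the pool and wait for its threads.
                pool.shutdown();
                pool.awaitTermination(5, TimeUnit.SECONDS);
                for (Thread t : workers) {
                    t.join(5000);
                }
            }

            int alive = 0;
            for (Thread t : workers) {
                if (t.isAlive()) {
                    alive++;
                }
            }
            return alive;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        } finally {
            pool.shutdownNow(); // cleanup so this sketch does not hang itself
        }
    }

    public static void main(String[] args) {
        System.out.println("alive without shutdown: " + leakedWorkers(4, false));
        System.out.println("alive with shutdown:    " + leakedWorkers(4, true));
    }
}
```

Without the final shutdownNow() call, the first case would leave four idle non-daemon threads behind, and a JVM in that state does not exit when main returns. Whether that is what happens in the CVB driver is an open question that the attached patch is meant to help answer.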