Using KMeansDriver leaves open files and can lead to FileNotFoundException - 
"too many open files" error
--------------------------------------------------------------------------------------------------------

                 Key: MAHOUT-395
                 URL: https://issues.apache.org/jira/browse/MAHOUT-395
             Project: Mahout
          Issue Type: Bug
          Components: Clustering
    Affects Versions: 0.3, 0.2, 0.1, 0.4
            Reporter: Scott Ganyo
            Priority: Critical


KMeansDriver uses the isConverged() method to determine whether the k-means 
clustering run is complete.  isConverged() has to open each SequenceFile and 
read each cluster to check whether it has converged.  The readers are not 
explicitly closed during this process, so when a large number of sequence 
files are opened, the driving system may run out of file handles before the 
readers are implicitly reclaimed.  I'm attaching a patch that explicitly closes 
each file as soon as it is no longer needed.
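
The following is a minimal sketch of the close-as-you-go pattern described 
above.  It is not the attached patch and does not use Mahout or Hadoop 
classes.  The class ConvergenceCheck, the plain-text part files, and the 
"converged" line marker are all hypothetical stand-ins so that the example 
runs with only the JDK.  In the real driver the resource would be a Hadoop 
SequenceFile.Reader, which also has a close() method.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical stand-in for the driver's convergence check.
final class ConvergenceCheck {

    // Returns true only if every line of every part file is "converged".
    // Assumption: each line represents one cluster's convergence flag.
    static boolean isConverged(List<Path> partFiles) throws IOException {
        for (Path part : partFiles) {
            // try-with-resources closes the reader when this block exits,
            // including on the early return and on exceptions, so at most
            // one file handle is held open at a time.
            try (BufferedReader reader =
                     Files.newBufferedReader(part, StandardCharsets.UTF_8)) {
                String line;
                while ((line = reader.readLine()) != null) {
                    if (!line.trim().equals("converged")) {
                        return false;
                    }
                }
            }
        }
        return true;
    }
}
```

On Java versions without try-with-resources, the same effect comes from 
calling close() in a finally block around the read loop.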

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
