[ https://issues.apache.org/jira/browse/SPARK-9246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14646312#comment-14646312 ]
Meihua Wu commented on SPARK-9246:
----------------------------------

Cool, I see. I will keep you updated on the progress.

I have a question: is topDocumentsPerTopic exact or approximate (like describeTopics which, according to the ScalaDoc, "may not return exactly the top-weighted terms for each topic; to get a more precise set of top terms, increase maxTermsPerTopic.")?

> DistributedLDAModel predict top docs per topic
> ----------------------------------------------
>
>                 Key: SPARK-9246
>                 URL: https://issues.apache.org/jira/browse/SPARK-9246
>             Project: Spark
>          Issue Type: New Feature
>          Components: MLlib
>            Reporter: Joseph K. Bradley
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> For each topic, return top documents based on topicDistributions.
> Synopsis:
> {code}
> /**
>  * @param maxDocuments  Max docs to return for each topic
>  * @return  Array over topics of (sorted top docs, corresponding doc-topic weights)
>  */
> def topDocumentsPerTopic(maxDocuments: Int): Array[(Array[Long], Array[Double])]
> {code}
> Note: We will need to make sure that the above return value format is Java-friendly.
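
To make the exact-versus-approximate question concrete, here is a minimal sketch of what an exact version could look like on local collections. It is plain Scala with no Spark dependency and is not the MLlib implementation. The object name TopDocsSketch, the numTopics parameter, and the (docId, topicDistribution) input shape are assumptions made for illustration only. Each topic keeps a bounded min-heap of size maxDocuments, so every document is considered and the result is exact up to the ordering of ties.

```scala
import scala.collection.mutable

// Hypothetical local helper for illustration; not part of Spark MLlib.
object TopDocsSketch {

  /**
   * Exact top documents per topic from local (docId, topicDistribution) pairs.
   *
   * @param topicDistributions assumed input: one (docId, weights over topics) pair per document
   * @param numTopics          number of topics (length of each weight array)
   * @param maxDocuments       max docs to return for each topic
   * @return array over topics of (sorted top docs, corresponding doc-topic weights)
   */
  def topDocumentsPerTopic(
      topicDistributions: Seq[(Long, Array[Double])],
      numTopics: Int,
      maxDocuments: Int): Array[(Array[Long], Array[Double])] = {
    // Reversed ordering turns the priority queue into a min-heap on weight,
    // so dequeue() removes the smallest weight currently retained.
    val minFirst = Ordering.by[(Double, Long), Double](_._1).reverse
    val heaps = Array.fill(numTopics)(
      mutable.PriorityQueue.empty[(Double, Long)](minFirst))

    for ((docId, dist) <- topicDistributions; k <- 0 until numTopics) {
      val heap = heaps(k)
      heap.enqueue((dist(k), docId))
      if (heap.size > maxDocuments) heap.dequeue() // keep only the top maxDocuments
    }

    // Emit each topic's documents sorted by descending weight.
    heaps.map { heap =>
      val sorted = heap.toArray.sortBy(-_._1)
      (sorted.map(_._2), sorted.map(_._1))
    }
  }
}
```

In a distributed setting the same per-topic bounded heaps could in principle be built per partition and then merged, which would still give an exact answer. Whether the actual implementation does this is the open question above.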