[ https://issues.apache.org/jira/browse/SPARK-4817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14241001#comment-14241001 ]
Sean Owen commented on SPARK-4817:
----------------------------------

I'm not sure if this is what you're looking for, but you can always call {{foreachRDD}}, operate on the whole RDD, and then call {{take}} on the RDD to get a few elements to print. Your PR reimplements the same thing less efficiently.

But this is not what your example above does. Neither prints the "top" elements. Did you mean "first"?

> [streaming]Print the specified number of data and handle all of the elements in RDD
> -----------------------------------------------------------------------------------
>
>                 Key: SPARK-4817
>                 URL: https://issues.apache.org/jira/browse/SPARK-4817
>             Project: Spark
>          Issue Type: New Feature
>          Components: Streaming
>            Reporter: 宿荣全
>            Priority: Minor
>
> The DStream.print function prints 10 elements but handles only 11 elements.
> A new function based on DStream.print is proposed:
> it prints a specified number of elements and handles all of the elements in the RDD.
> A use case:
> val dstream = stream.map->filter->mapPartitions->print
> After the filter, the data must update a database in mapPartitions, but not
> every element needs to be printed; only the top 20 need to be printed to
> monitor the data processing.
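
For illustration, here is a minimal sketch of the {{foreachRDD}}-then-{{take}} approach suggested in the comment. It is not code from the PR or from Spark itself. The names {{ForeachRddTakeSketch}} and {{handleAllAndTake}}, the queue-backed input, the local master, and the no-op "database update" are all assumptions made for the example. {{awaitTerminationOrTimeout}} comes from Spark releases later than this thread. The side effect runs in an action ({{foreachPartition}}) so that every element is handled, and {{take}} then fetches only the first few elements for printing.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.rdd.RDD
import org.apache.spark.streaming.{Seconds, StreamingContext}
import scala.collection.mutable

object ForeachRddTakeSketch {

  // Hypothetical helper: apply `handle` to every partition of the RDD,
  // then return only the first `n` elements for display.
  def handleAllAndTake[T](rdd: RDD[T], n: Int)(handle: Iterator[T] => Unit): Array[T] = {
    rdd.cache()                  // avoid recomputing the lineage for take()
    rdd.foreachPartition(handle) // action: touches all elements
    val head = rdd.take(n)       // action: first n elements only
    rdd.unpersist()
    head
  }

  def main(args: Array[String]): Unit = {
    // Assumed local setup, for demonstration only.
    val conf = new SparkConf().setMaster("local[2]").setAppName("ForeachRddTakeSketch")
    val ssc = new StreamingContext(conf, Seconds(1))

    // Hypothetical input: one batch of 100 integers fed through a queue.
    val queue = mutable.Queue[RDD[Int]](ssc.sparkContext.parallelize(1 to 100))
    val filtered = ssc.queueStream(queue).map(_ * 2).filter(_ % 4 == 0)

    filtered.foreachRDD { rdd =>
      val head = handleAllAndTake(rdd, 20) { iter =>
        // Placeholder for the per-partition database update.
        iter.foreach(_ => ())
      }
      head.foreach(println)
    }

    ssc.start()
    ssc.awaitTerminationOrTimeout(5000)
    ssc.stop(stopSparkContext = true, stopGracefully = false)
  }
}
```

Putting the database update inside a lazy {{mapPartitions}} and relying on {{print}} would evaluate only as many partitions as the printed elements require, which is why the sketch moves the update into an action instead.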