I'm not 100% sure I understand your question, but yes, Spark (both the RDD API and SQL/DataFrame) does partial aggregation, i.e. map-side combining.
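
In case the term is the source of confusion, here is a minimal plain-Python sketch of what partial (map-side) aggregation means. It is not Spark code and does not reflect Spark's internals; the function names and the sample data are made up for illustration. Each partition is combined locally first, so fewer records need to cross the shuffle.

```python
from collections import defaultdict


def partial_aggregate(partition):
    """Map side: sum values per key within one partition, before any shuffle."""
    sums = defaultdict(int)
    for key, value in partition:
        sums[key] += value
    return dict(sums)


def final_aggregate(partials):
    """Reduce side: merge the per-partition partial sums into final totals."""
    totals = defaultdict(int)
    for partial in partials:
        for key, value in partial.items():
            totals[key] += value
    return dict(totals)


# Hypothetical input: two partitions holding six (key, value) records in total.
partitions = [
    [("a", 1), ("b", 2), ("a", 3)],
    [("b", 4), ("a", 5), ("b", 6)],
]

partials = [partial_aggregate(p) for p in partitions]
# Number of records that would be shuffled after combining locally.
shuffled_records = sum(len(p) for p in partials)
totals = final_aggregate(partials)

print(shuffled_records, totals)
```

In the RDD API, reduceByKey is one operator that combines on the map side in roughly this way.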

On Tue, Feb 9, 2016 at 8:37 PM, Rishitesh Mishra <[email protected]> wrote:
> Can anybody confirm, whether ANY operator in Spark SQL uses
> map-side-combine ? If not, is it safe to assume SortShuffleManager will
> always use Serialized sorting in case of queries from Spark SQL ?
