Rather, this is a fundamental question: is it an architectural constraint that the collect action always returns its results to the driver? It gobbles up all of the driver's memory (in the case of cached data), so why can't we have a dedicated executor that shares the load and “somehow” merges the results? If this is by design, I am flummoxed.
-RK