SPIP: SPARK-18085: History server enhancements

2017-07-12 Thread Marcelo Vanzin
Hi all, I've requested feedback on this a few times in the past, but since it has now been labeled an SPIP, I'll do it again. Please take a look and provide any feedback you might have! Also, be aware that development is well under way for this and the current code has diverged slightly from the ap

More efficient RDD.count() implementation

2017-07-12 Thread OBones
Hello, As I have written my own data source, I also wrote a custom RDD[Row] implementation to provide getPartitions and compute overrides. This works very well, but while doing some performance analysis I see that, for any given pipeline fit operation, a fair amount of time is spent in the RDD.count

CVE-2017-7678 Apache Spark XSS web UI MHTML vulnerability

2017-07-12 Thread Sean Owen
Severity: Low. Vendor: The Apache Software Foundation. Versions Affected: Versions of Apache Spark before 2.2.0. Description: It is possible for an attacker to take advantage of a user's trust in the server to trick them into visiting a link that points to a shared Spark cluster and submits data in