I figured out this issue (in our case)... and I'll vent a little in my reply here. =:)

Fedora's well-intentioned firewall (firewall-cmd) requires you to open (enable) any port/service on a host that you need to connect to (SSH/22 is enabled by default, of course). So when launching client applications that use ephemeral ports for remote services to connect back to (as a Spark application does for the remote YARN ResourceManager/NodeManagers), you can't know in advance what that port will be in order to enable it, unless the application lets you specify it as a launch property (which Spark does, e.g. --conf spark.driver.port=NNNNN).

Again, well intentioned, but always a pain. So you have to either disable the firewall capability in Fedora, or open/enable a range of ports and tell your applications to use one of those.

Also note that as of this writing, firewall-cmd's port-forwarding from the HOST to GUESTS in libvirt/KVM-based Hadoop/YARN/HDFS test/dev clusters doesn't work (it never has; it's on the TODO list). It's another capability you'll need in order to reach daemon ports running *inside* the KVM cluster (for example, UI ports). The work-around here (besides, again, disabling the Fedora firewall altogether) is to use same-subnet BRIDGING (not NAT-ing), which eliminates the need for port-forwarding. I've filed bugs for this in the past.

So that is why YARN applications weren't terminating correctly for Spark apps, or for that matter working at all, since Spark uses ephemeral ports (by necessity).

So whatever port your Spark application uses, remember to issue the command:

user@driverHost$ sudo firewall-cmd --zone=public --add-port=<SparkAppPort>/tcp

or, better yet, use the port-deterministic strategy mentioned earlier.

(Hopefully the verbosity here will help someone in their future search. Fedora aside, the original problem here can be network related, as I discovered.)

sincerely,
didata
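P.S. The two pieces above (opening ports and making Spark's ports deterministic) go together. Here's a minimal sketch of how I'd combine them; the port range 40000-40010 and the jar name are arbitrary examples, not defaults:

```shell
# Open a small port range for the Spark driver in the running firewall,
# and again with --permanent so it survives a firewalld reload
# (a plain --add-port only affects the runtime configuration):
sudo firewall-cmd --zone=public --add-port=40000-40010/tcp
sudo firewall-cmd --zone=public --permanent --add-port=40000-40010/tcp

# Launch the Spark app pinned to ports inside that range, so the ports
# you opened are actually the ones the YARN side connects back to:
spark-submit \
  --master yarn \
  --conf spark.driver.port=40000 \
  --conf spark.blockManager.port=40001 \
  your-app.jar
```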
-- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-processes-not-doing-on-killing-corresponding-YARN-application-tp13443p13819.html Sent from the Apache Spark User List mailing list archive at Nabble.com.