Hi Jacob,

The port configuration docs that we worked on together are now available at:
http://spark.apache.org/docs/latest/spark-standalone.html#configuring-ports-for-network-security

Thanks for the help!
Andrew
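(A quick, illustrative sketch of the kind of settings that documentation
section covers, not an excerpt from it: pinning a couple of normally dynamic
ports from application code so firewall rules can allow a fixed set. The
property names spark.driver.port and spark.ui.port are standard Spark
settings; the host name and port values below are placeholders.)

import org.apache.spark.{SparkConf, SparkContext}

object FixedPortExample {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("fixed-port-example")
      .setMaster("spark://master-host:7077")  // standalone master, default port 7077
      .set("spark.driver.port", "51000")      // port executors use to reach the driver
      .set("spark.ui.port", "4040")           // application web UI (4040 is the default)

    val sc = new SparkContext(conf)
    // ... run jobs ...
    sc.stop()
  }
}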
On Wed, May 28, 2014 at 3:21 PM, Jacob Eisinger <jeis...@us.ibm.com> wrote:

Howdy Andrew,

This is a standalone cluster. And, yes, if my understanding of Spark
terminology is correct, you are correct about the port ownerships.

Jacob

Jacob D. Eisinger
IBM Emerging Technologies
jeis...@us.ibm.com - (512) 286-6075


From: Andrew Ash <and...@andrewash.com>
To: user@spark.apache.org
Date: 05/28/2014 05:18 PM
Subject: Re: Comprehensive Port Configuration reference?
------------------------------

Hmm, those do look like 4 listening ports to me. PID 3404 is an executor and
PID 4762 is a worker? This is a standalone cluster?


On Wed, May 28, 2014 at 8:22 AM, Jacob Eisinger <jeis...@us.ibm.com> wrote:

Howdy Andrew,

Here is what I ran before an application context was created (other services
have been deleted):

# netstat -l -t tcp -p --numeric-ports
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address        Foreign Address   State    PID/Program name
tcp6       0      0 10.90.17.100:8888    :::*              LISTEN   4762/java
tcp6       0      0 :::8081              :::*              LISTEN   4762/java

And then, while the application context is up:

# netstat -l -t tcp -p --numeric-ports
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address        Foreign Address   State    PID/Program name
tcp6       0      0 10.90.17.100:8888    :::*              LISTEN   4762/java
tcp6       0      0 :::57286             :::*              LISTEN   3404/java
tcp6       0      0 10.90.17.100:38118   :::*              LISTEN   3404/java
tcp6       0      0 10.90.17.100:35530   :::*              LISTEN   3404/java
tcp6       0      0 :::60235             :::*              LISTEN   3404/java
tcp6       0      0 :::8081              :::*              LISTEN   4762/java

My understanding is that this says four ports are open. Are 57286 and 60235
not being used?

Jacob

Jacob D. Eisinger
IBM Emerging Technologies
jeis...@us.ibm.com - (512) 286-6075


From: Andrew Ash <and...@andrewash.com>
To: user@spark.apache.org
Date: 05/25/2014 06:25 PM
Subject: Re: Comprehensive Port Configuration reference?
------------------------------

Hi Jacob,

The config option spark.history.ui.port is new for 1.0. The problem that the
History server solves is that in non-standalone cluster deployment modes
(Mesos and YARN) there is no long-lived Spark Master that can store logs and
statistics about an application after it finishes. The History server is the
UI that renders logged data from applications after they complete.
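(A rough sketch, not part of the original message: for the History server to
have anything to render, the application has to write event logs first. The
property names spark.eventLog.enabled and spark.eventLog.dir are the standard
event-logging settings; the HDFS path below is a placeholder.)

import org.apache.spark.{SparkConf, SparkContext}

object EventLogExample {
  def main(args: Array[String]): Unit = {
    // Enable event logging so a separately started History server can
    // replay this application's UI after it finishes. Note that
    // spark.history.ui.port is read by the History server daemon itself,
    // not by the application.
    val conf = new SparkConf()
      .setAppName("event-log-example")
      .set("spark.eventLog.enabled", "true")
      .set("spark.eventLog.dir", "hdfs:///tmp/spark-events")  // placeholder location

    val sc = new SparkContext(conf)
    // ... run jobs ...
    sc.stop()
  }
}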
Read more here: https://issues.apache.org/jira/browse/SPARK-1276 and
https://github.com/apache/spark/pull/204

As far as the two vs four dynamic ports, are those all listening ports? I did
observe 4 ports in use, but only two of them were listening. The other two
were the random ports used for responses on outbound connections: the source
port of the (srcIP, srcPort, dstIP, dstPort) tuple that uniquely identifies a
TCP socket (a short illustration appears at the end of this thread).

http://unix.stackexchange.com/questions/75011/how-does-the-server-find-out-what-client-port-to-send-to

Thanks for taking a look through!

I also realized that I had a couple of mistakes in the 0.9 to 1.0 transition,
so I've documented those as well in the updated PR.

Cheers!
Andrew


On Fri, May 23, 2014 at 2:43 PM, Jacob Eisinger <jeis...@us.ibm.com> wrote:

Howdy Andrew,

I noticed you have a configuration item that we were not aware of:
spark.history.ui.port. Is that new for 1.0?

Also, we noticed that the Workers and the Drivers were opening up four
dynamic ports per application context. It looks like you were seeing two.

Everything else looks like it aligns!

Jacob

Jacob D. Eisinger
IBM Emerging Technologies
jeis...@us.ibm.com - (512) 286-6075


From: Andrew Ash <and...@andrewash.com>
To: user@spark.apache.org
Date: 05/23/2014 10:30 AM
Subject: Re: Comprehensive Port Configuration reference?
------------------------------

Hi everyone,

I've also been interested in better understanding what ports are used where
and the direction the network connections go. I've observed a running cluster
and read through code, and came up with the below documentation addition.

https://github.com/apache/spark/pull/856

Scott and Jacob -- it sounds like you two have pulled together some of this
yourselves for writing firewall rules. Would you mind taking a look at this
pull request and confirming that it matches your observations? Wrong
documentation is worse than no documentation, so I'd like to make sure this
is right.

Cheers,
Andrew


On Wed, May 7, 2014 at 10:19 AM, Mark Baker <dist...@acm.org> wrote:

On Tue, May 6, 2014 at 9:09 AM, Jacob Eisinger <jeis...@us.ibm.com> wrote:
> In a nutshell, Spark opens up a couple of well-known ports. And then the
> workers and the shell open up dynamic ports for each job. These dynamic
> ports make securing the Spark network difficult.

Indeed.

Judging by the frequency with which this topic arises, this is a concern for
many (myself included).

I couldn't find anything in JIRA about it, but I'm curious to know whether
the Spark team considers this a problem in need of a fix?

Mark.
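(Not part of the thread itself: a minimal sketch of the ephemeral source-port
behavior Andrew describes above. The destination host and port are
placeholders; any reachable TCP listener would do.)

import java.net.Socket

object EphemeralPortDemo {
  def main(args: Array[String]): Unit = {
    // An outbound TCP connection gets an OS-assigned ephemeral source port.
    // That port is the srcPort half of the (srcIP, srcPort, dstIP, dstPort)
    // tuple; it is not a service the process deliberately listens on.
    val socket = new Socket("example.org", 80)  // placeholder destination
    println(s"connected from local (ephemeral) port ${socket.getLocalPort}")
    socket.close()
  }
}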