Hi Tommy,

{quote}
Because I don't see how this is ever going to work in scenarios where the AM is on a different node than the containers.
{quote}

-- I do not quite understand this part. The AM is essentially running in a container as well, and the HTTP server is brought up in that same container.

{quote}
even if we can't get a better address for the AM from YARN, we could at least filter the addresses we get back from the JVM to exclude loopbacks.
{quote}

-- You are right. InetAddress.getLocalHost() sometimes gives back a loopback address. We should filter this out. From a quick search, one possible solution: <http://www.coderanch.com/t/491883/java/java/IP>.

+ @Yi, @Navina,

Also, I think this fix should go into the 0.10.0 release. What do you guys think?

Thanks,

Fang, Yan
yanfang...@gmail.com

On Thu, Jul 30, 2015 at 6:39 PM, Yan Fang <yanfang...@gmail.com> wrote:

> Just one point to add:
>
> {quote}
> AM gets notified of container status from the RM.
> {quote}
>
> I think this is not 100% correct. AM can communicate with NM through
> NMClientAsync
> <https://hadoop.apache.org/docs/r2.7.1/api/org/apache/hadoop/yarn/client/api/async/NMClientAsync.html>
> to get container status, though Samza does not implement the CallbackHandler.
>
> Thanks,
>
> Fang, Yan
> yanfang...@gmail.com
>
> On Thu, Jul 30, 2015 at 6:06 PM, Navina Ramesh <nram...@linkedin.com.invalid> wrote:
>
>> The NM (and hence, by extension the container) heartbeats to the RM, not
>> the AM. AM gets notified of container status from the RM.
>> The AM starts / stops / releases a container process by communicating to the NM.
>>
>> Navina
>>
>> On Thu, Jul 30, 2015 at 5:55 PM, Thomas Becker <tobec...@tivo.com> wrote:
>>
>> > Ok, I thought there was some communication from the container to the AM,
>> > it sounds like you're saying it's in the other direction only? Don't
>> > containers heartbeat to the AM? Regardless, even if we can't get a better
>> > address for the AM from YARN, we could at least filter the addresses we get
>> > back from the JVM to exclude loopbacks.
>> >
>> > -Tommy
>> > ________________________________________
>> > From: Navina Ramesh [nram...@linkedin.com.INVALID]
>> > Sent: Thursday, July 30, 2015 8:40 PM
>> > To: dev@samza.apache.org
>> > Subject: Re: Coordinator URL always 127.0.0.1
>> >
>> > Hi Tommy,
>> > Yi is right. Container start is coordinated by the AppMaster using an
>> > NMClient. Container host name and port is provided by the RM during
>> > allocation.
>> > In Yarn (at least, afaik), when the node joins a cluster, the NM registers
>> > itself with the RM. So, the NM might still be using
>> > getLocalhost.getAddress().
>> >
>> > I don't know of any other way to programmatically fetch the machine's
>> > hostname (apart from some hacky shell commands).
>> >
>> > Cheers,
>> > Navina
>> >
>> > On Thu, Jul 30, 2015 at 5:23 PM, Yi Pan <nickpa...@gmail.com> wrote:
>> >
>> > > Hi, Tommy,
>> > >
>> > > Yeah, I agree that the current implementation is not bullet-proof to any
>> > > different networking configuration on the host. As for the AM <-> container
>> > > communication, if I am not mistaken, it is through the NMClient and the
>> > > node HTTP address is wrapped within the Container object returned from RM.
>> > > I am not very familiar with that part of source code. Navina may be able to
>> > > help more here.
>> > >
>> > > -Yi
>> > >
>> > > On Thu, Jul 30, 2015 at 4:27 PM, Thomas Becker <tobec...@tivo.com> wrote:
>> > >
>> > > > Hi Yi,
>> > > > Thanks a lot for your reply. I don't doubt we can get it to work by
>> > > > mucking with the networking configuration, but to me this feels like a
>> > > > workaround, not a solution. InetAddress.getLocalHost().getHostAddress() is
>> > > > not a reliable way of obtaining an IP that other machines can connect to.
>> > > > Just today I tested on several Linux distros and it did not work on any of
>> > > > them. Can we do something more robust here?
>> > > > How does the container communicate status to the AM?
>> > > >
>> > > > -Tommy
>> > > >
>> > > > ________________________________________
>> > > > From: Yi Pan [nickpa...@gmail.com]
>> > > > Sent: Thursday, July 30, 2015 6:48 PM
>> > > > To: dev@samza.apache.org
>> > > > Subject: Re: Coordinator URL always 127.0.0.1
>> > > >
>> > > > Hi, Tommy,
>> > > >
>> > > > I think that it might be a commonly asked question regarding to multiple
>> > > > IPs on a single host. A common trick w/o changing code is (copied from SO:
>> > > > http://stackoverflow.com/questions/2381316/java-inetaddress-getlocalhost-returns-127-0-0-1-how-to-get-real-ip
>> > > > )
>> > > >
>> > > > {code}
>> > > > 1. Find your host name. Type: hostname. For example, you find your hostname
>> > > >    is mycomputer.xzy.com
>> > > > 2. Put your host name in your hosts file. /etc/hosts . Such as
>> > > >
>> > > >    10.50.16.136 mycomputer.xzy.com
>> > > > {code}
>> > > >
>> > > > -Yi
>> > > >
>> > > > On Thu, Jul 30, 2015 at 11:35 AM, Tommy Becker <tobec...@tivo.com> wrote:
>> > > >
>> > > > > We are testing some jobs on a YARN grid and noticed they are often not
>> > > > > starting up properly due to being unable to connect to the job coordinator.
>> > > > > After some investigation it seems as if the jobs are always getting a
>> > > > > coordinator URL of http://127.0.0.1:<port> But my understanding is that
>> > > > > the coordinator runs only in the AM, so I'd expect these URLs to more often
>> > > > > than not be to some other machine. Looking at the code however, I'm not
>> > > > > sure how that would ever happen since the URL for the coordinator always
>> > > > > comes from InetAddress.getLocalHost().getHostAddress() in
>> > > > > org.apache.samza.coordinator.server.HttpServer#getUrl
>> > > > >
>> > > > > Am I off base here?
>> > > > > Because I don't see how this is ever going to work in
>> > > > > scenarios where the AM is on a different node than the containers.
>> > > > >
>> > > > > --
>> > > > > Tommy Becker
>> > > > > Senior Software Engineer
>> > > > >
>> > > > > Digitalsmiths
>> > > > > A TiVo Company
>> > > > >
>> > > > > www.digitalsmiths.com<http://www.digitalsmiths.com>
>> > > > > tobec...@tivo.com<mailto:tobec...@tivo.com>
>> > > > >
>> > > > > ________________________________
>> > > > >
>> > > > > This email and any attachments may contain confidential and privileged
>> > > > > material for the sole use of the intended recipient. Any review, copying,
>> > > > > or distribution of this email (or any attachments) by others is prohibited.
>> > > > > If you are not the intended recipient, please contact the sender
>> > > > > immediately and permanently delete this email and any attachments. No
>> > > > > employee or agent of TiVo Inc. is authorized to conclude any binding
>> > > > > agreement on behalf of TiVo Inc. by email. Binding agreements with TiVo
>> > > > > Inc. may only be made by a signed written agreement.
>> > > > >
>> >
>> > --
>> > Navina R.
>>
>> --
>> Navina R.
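
For the loopback filtering discussed in this thread, a rough sketch of one way it could look is below. It is only an illustration built on the JDK's java.net.NetworkInterface and InetAddress APIs, not the actual Samza change. The class name RoutableAddress and both method names are hypothetical, and preferring IPv4 and skipping link-local addresses are assumptions made for the example.

```java
import java.net.Inet4Address;
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;
import java.util.Optional;

// Hypothetical helper for illustration only; not part of Samza.
final class RoutableAddress {

  // Pure filter: returns the first IPv4 address that is neither loopback
  // nor link-local (IPv4-only is an assumption of this sketch).
  static Optional<InetAddress> firstNonLoopback(List<InetAddress> candidates) {
    for (InetAddress addr : candidates) {
      if (addr instanceof Inet4Address
          && !addr.isLoopbackAddress()
          && !addr.isLinkLocalAddress()) {
        return Optional.of(addr);
      }
    }
    return Optional.empty();
  }

  // Collects the addresses of every interface that is up and is not a
  // loopback device.
  static List<InetAddress> localAddresses() throws SocketException {
    List<InetAddress> result = new ArrayList<>();
    Enumeration<NetworkInterface> nics = NetworkInterface.getNetworkInterfaces();
    if (nics == null) {
      return result; // some platforms report "no interfaces" as null
    }
    for (NetworkInterface nic : Collections.list(nics)) {
      if (nic.isUp() && !nic.isLoopback()) {
        result.addAll(Collections.list(nic.getInetAddresses()));
      }
    }
    return result;
  }
}
```

A caller could try RoutableAddress.firstNonLoopback(RoutableAddress.localAddresses()) first and fall back to InetAddress.getLocalHost() only when the result is empty. On hosts with several non-loopback interfaces the first match may still not be the one other nodes can reach, so this narrows the problem rather than solving it outright.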