We are testing some jobs on a YARN grid and noticed that they often fail to start up properly because they cannot connect to the job coordinator. After some investigation, it seems the jobs are always given a coordinator URL of http://127.0.0.1:<port>. My understanding is that the coordinator runs only in the AM, so I'd expect these URLs to point at some other machine more often than not. Looking at the code, however, I'm not sure how that would ever happen, since the URL for the coordinator always comes from InetAddress.getLocalHost().getHostAddress() in org.apache.samza.coordinator.server.HttpServer#getUrl.
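To illustrate the concern, here is a minimal standalone sketch (not Samza code; the class and method names are hypothetical) that could be run on the AM host. It prints what InetAddress.getLocalHost() resolves to and, for comparison, the first non-loopback address found on an interface that is up. One common cause of a loopback result is an /etc/hosts entry that maps the machine's hostname to 127.0.0.1, though I have not confirmed that this is what is happening on our grid.

```java
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.net.UnknownHostException;
import java.util.Collections;
import java.util.Enumeration;
import java.util.Optional;

// Hypothetical diagnostic class; nothing here is part of Samza.
public class CoordinatorAddressCheck {

    /** True if the address could plausibly be reached from another host. */
    static boolean looksRemotelyUsable(InetAddress addr) {
        return !addr.isLoopbackAddress()
                && !addr.isAnyLocalAddress()
                && !addr.isLinkLocalAddress();
    }

    /** Convenience overload for IP literals such as "127.0.0.1" (no DNS lookup). */
    static boolean looksRemotelyUsable(String ipLiteral) {
        try {
            return looksRemotelyUsable(InetAddress.getByName(ipLiteral));
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("Not a valid address: " + ipLiteral, e);
        }
    }

    /** First address on an up, non-loopback interface that passes the check above. */
    static Optional<InetAddress> firstNonLoopbackAddress() {
        try {
            Enumeration<NetworkInterface> nics = NetworkInterface.getNetworkInterfaces();
            if (nics == null) {
                return Optional.empty();
            }
            for (NetworkInterface nic : Collections.list(nics)) {
                if (!nic.isUp() || nic.isLoopback()) {
                    continue;
                }
                for (InetAddress addr : Collections.list(nic.getInetAddresses())) {
                    if (looksRemotelyUsable(addr)) {
                        return Optional.of(addr);
                    }
                }
            }
            return Optional.empty();
        } catch (SocketException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws UnknownHostException {
        // Same call chain the coordinator URL is built from, per the code cited above.
        InetAddress local = InetAddress.getLocalHost();
        System.out.println("getLocalHost(): " + local.getHostAddress()
                + " (remotely usable: " + looksRemotelyUsable(local) + ")");
        System.out.println("first non-loopback: "
                + firstNonLoopbackAddress().map(InetAddress::getHostAddress).orElse("none found"));
    }
}
```

If the first line reports a loopback address while the second reports a routable one, that would be consistent with containers on other nodes receiving a URL they cannot reach.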
Am I off base here? I don't see how this can ever work in scenarios where the AM is on a different node than the containers.

--
Tommy Becker
Senior Software Engineer
Digitalsmiths, A TiVo Company
www.digitalsmiths.com