The first issue he runs into is one I also find frustrating -- with cloud providers pushing SSDs, you have to use a pretty large instance type to get a reasonable test setup. I'm not sure whether he couldn't launch an older type like m1.large (I think some newer AWS accounts aren't able to) or just didn't see it as an option, since the older types are hidden by default. Even the largest general purpose instance types are pretty wimpy with respect to storage: only 80GB of local instance storage.
The hostname issues are a well known pain point and unfortunately there aren't any great solutions that aren't EC2-specific. Here's a quick rundown:

* None of the images for popular distros on EC2 will auto-set the hostname beyond what EC2 already sets up (which isn't publicly routable). The following details might explain why they can't. For example, a recent Ubuntu image gives:

    ubuntu@ip-172-30-2-76:~$ hostname
    ip-172-30-2-76
    ubuntu@ip-172-30-2-76:~$ cat /etc/hosts
    127.0.0.1 localhost
    # The following lines are desirable for IPv6 capable hosts
    ::1 ip6-localhost ip6-loopback
    --- cut irrelevant pieces ---

* Sometimes the hostname is set, but isn't useful. For example, in this Ubuntu image, the hostname is set to "ip-[ip-address-]", but that isn't routable, so it generates really irritating behavior. Running on the server itself (which is running in a VPC, see below for more details):

    scala> InetAddress.getLocalHost
    java.net.UnknownHostException: ip-172-30-2-76: ip-172-30-2-76: Name or service not known
        at java.net.InetAddress.getLocalHost(InetAddress.java:1473)
        at .<init>(<console>:9)
        at .<clinit>(<console>)
        at .<init>(<console>:11)
        at .<clinit>(<console>)
        at $print(<console>)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at scala.tools.nsc.interpreter.IMain$ReadEvalPrint.call(IMain.scala:704)
        at scala.tools.nsc.interpreter.IMain$Request$$anonfun$14.apply(IMain.scala:920)
        at scala.tools.nsc.interpreter.Line$$anonfun$1.apply$mcV$sp(Line.scala:43)
        at scala.tools.nsc.io.package$$anon$2.run(package.scala:25)
        at java.lang.Thread.run(Thread.java:745)
    Caused by: java.net.UnknownHostException: ip-172-30-2-76: Name or service not known
        at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
        at java.net.InetAddress$1.lookupAllHostAddr(InetAddress.java:901)
        at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1293)
        at java.net.InetAddress.getLocalHost(InetAddress.java:1469)
        ... 14 more

* As described in a bunch of places, the only reliable way to get public DNS info is through EC2's own instance metadata API:

    https://forums.aws.amazon.com/thread.jspa?threadID=77788

  For example:

    curl -s http://169.254.169.254/latest/meta-data/public-hostname

  might give something like:

    ec2-203-0-113-25.compute-1.amazonaws.com

* But you may not even *have* a public DNS hostname. If you launch in a VPC, you'll only get one if you set the VPC to generate them (and I'm pretty sure the default is to not create them):

    http://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/vpc-dns.html

  In that case the output of the curl call above will just be empty.

* AWS is pretty aggressively trying to move away from EC2-Classic (i.e. non-VPC instances), so most new instances will end up in VPCs unless you are working in a grandfathered account + AZ. If VPC without public DNS is the default, we'll have to carefully guide new users in generating a setup that works properly if we try to use hostnames.

* Even if you try moving to IP addresses, you still have to deal with VPCs. You can't directly get your public IP address without accessing something outside the host since you're in a VPC. You need to use the instance metadata API to look it up, i.e.,

    curl -s http://169.254.169.254/latest/meta-data/public-ipv4

* And yet another problem with IPs: unless you use an elastic IP, you're not guaranteed they'll be stable:

    Auto-assign Public IP

    Requests a public IP address from Amazon's public IP address pool, to make your instance reachable from the Internet. In most cases, the public IP address is associated with the instance until it’s stopped or terminated, after which it’s no longer available for you to use. If you require a persistent public IP address that you can associate and disassociate at will, use an Elastic IP address (EIP) instead. You can allocate your own EIP, and associate it to your instance after launch.

I know Spark had some similar issues -- using their (very convenient!) ec2 script, you still ended up with some stuff in their web interface that linked to internal addresses such that the links wouldn't work. I'm not sure if they have figured out a decent workaround.

But as you can see from the above, it's unlikely you can use generic approaches to get the info we need -- it'll need to be platform specific, which probably means it's better to determine it outside the main Kafka code and provide it via advertised.host.name.

-Ewen

On Fri, Oct 17, 2014, at 05:11 PM, Gwen Shapira wrote:
> Basically, the issue (or at least one of very many possible network
> issues...) is that the server has "localhost" hardcoded as its
> canonical name in /etc/hosts:
>
> [root@Billc-cent70x64 ~]# cat /etc/hosts
> 127.0.0.1 localhost localhost.localdomain localhost4
> localhost4.localdomain4 Billc-cent70x64
> ::1 localhost localhost.localdomain localhost6
> localhost6.localdomain6
>
> Unfortunately a very common default for RedHat and Centos machines.
>
> As the blog mentions, a good solution (other than instructing Kafka on
> the right name to advertise) is to add the correct IP and hostname to
> /etc/hosts. We may want to add this option to the FAQ.
>
> Gwen
>
>
> On Fri, Oct 17, 2014 at 7:56 PM, Gwen Shapira <gshap...@cloudera.com>
> wrote:
> > It looks like we are using canonical hostname:
> >
> > def register() {
> >   val advertisedHostName =
> >     if(advertisedHost == null || advertisedHost.trim.isEmpty)
> >       InetAddress.getLocalHost.getCanonicalHostName
> >     else
> >       advertisedHost
> >   val jmxPort =
> >     System.getProperty("com.sun.management.jmxremote.port", "-1").toInt
> >   ZkUtils.registerBrokerInZk(zkClient, brokerId, advertisedHostName,
> >     advertisedPort, zkSessionTimeoutMs, jmxPort)
> > }
> >
> > So never mind :)
> >
> >
> > On Fri, Oct 17, 2014 at 6:36 PM, Jay Kreps <jay.kr...@gmail.com> wrote:
> >> Hmm, yes, actually I don't think I actually understand the issue. Basically
> >> as I understand it we do InetAddress.getLocalHost.getHostAddress which on
> >> AWS picks the wrong hostname/ip and then the producer can't connect. People
> >> eventually find this FAQ, but I was hoping there was a more automatic way
> >> since everyone is on AWS these days. Maybe getCanonicalHostName would fix
> >> it?
> >>
> >> https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-Whycan'tmyconsumers/producersconnecttothebrokers
> >> ?
> >>
> >> -Jay
> >>
> >> On Fri, Oct 17, 2014 at 3:19 PM, Gwen Shapira <gshap...@cloudera.com>
> >> wrote:
> >>
> >>> In #2, do you refer to advertising the "internal" hostname instead of
> >>> the external one?
> >>> In this case, will it be enough to use getCanonicalHostName (which
> >>> uses a name service)?
> >>>
> >>> Note that I think the problem the blog reported (wrong name
> >>> advertised) is somewhat orthogonal to the question of which interface
> >>> we bind to (which should probably be the default interface).
> >>>
> >>> Gwen
> >>>
> >>> On Fri, Oct 17, 2014 at 5:28 PM, Jay Kreps <jay.kr...@gmail.com> wrote:
> >>> > This guy documented a few struggles getting going with Kafka. Not sure
> >>> > if
> >>> > there is anything we can do to make it better?
> >>> >
> >>> > http://ispyker.blogspot.com/2014/10/kafka-part-1.html
> >>> >
> >>> > 1. Would be great to figure out the apache/gradle thing.
> >>> > 2. The problem of having Kafka advertise localhost on AWS is really
> >>> common.
> >>> > I was thinking one possible solution for this would be to get all the
> >>> > interfaces and prefer non-localhost interfaces if they exist.
> >>> >
> >>> > -Jay
> >>>
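
P.S. To make the "determine it outside the main Kafka code" suggestion from the top of this message a bit more concrete, here is a rough shell sketch of how a launch script might pick a value for advertised.host.name on EC2. It only uses the two instance metadata paths shown above (public-hostname and public-ipv4). The function names (ec2_metadata, first_non_empty, advertised_host) are made up for illustration and are not part of Kafka or any AWS tool, and the fallback order (public DNS name, then public IP, then the local hostname) is just one assumption about what a setup might want.

```shell
#!/bin/sh
# Sketch only: choose a value for advertised.host.name on EC2.
# All function names below are hypothetical helpers, not Kafka or AWS tooling.

METADATA_URL="http://169.254.169.254/latest/meta-data"

# Fetch one instance metadata key. Prints nothing if the key is missing,
# empty, or the metadata service is unreachable (e.g. not running on EC2).
ec2_metadata() {
    curl -s -f --max-time 2 "$METADATA_URL/$1" 2>/dev/null || true
}

# Print the first non-empty argument; return non-zero if all are empty.
first_non_empty() {
    for candidate in "$@"; do
        if [ -n "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Assumed preference order: public DNS name, then public IP, then hostname.
advertised_host() {
    first_non_empty \
        "$(ec2_metadata public-hostname)" \
        "$(ec2_metadata public-ipv4)" \
        "$(hostname)"
}
```

A launch script could then run something like `echo "advertised.host.name=$(advertised_host)"` and merge that line into the broker config before starting Kafka. Note that the last fallback is the same unhelpful internal name discussed above, so a real setup would probably rather fail loudly at that point, and without an elastic IP the chosen value can go stale after a stop/start.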