michaeljmarshall commented on issue #15401:
URL: https://github.com/apache/pulsar/issues/15401#issuecomment-1177120312
I didn't realize you could start pulsar that way, thanks for sharing!
Your issue helped me solve a question I have been wondering about for a
while (thank you!).
I think I found the likely root cause of your issue, but you will need to
confirm. If I am correct, here is the TLDR: Pulsar discovers its IP address for
things like `advertisedAddress` using a lookup of the hostname, and if your
machine's hostname is not defined in the `/ect/hosts` file, then (based on my
troubleshooting) the JVM will perform a DNS lookup on the address and issues
can arise when the resulting IP does not route to the right Pulsar process.
Based on your logs, I noticed that some of your stack traces indicate trying
to connect to `192.168.1.2:4181`:
```
2022-06-02T13:29:02,974-0400 [client-scheduler-OrderedScheduler-28-0] ERROR
org.apache.bookkeeper.clients.impl.internal.RootRangeClientImplWithRetries -
Reason for the failure
io.grpc.StatusRuntimeException: UNAVAILABLE: io exception
at io.grpc.Status.asRuntimeException(Status.java:535)
~[io.grpc-grpc-api-1.45.1.jar:1.45.1]
at
io.grpc.stub.ClientCalls$UnaryStreamToFuture.onClose(ClientCalls.java:533)
~[io.grpc-grpc-stub-1.45.1.jar:1.45.1]
at
io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:562)
~[io.grpc-grpc-core-1.45.1.jar:1.45.1]
at io.grpc.internal.ClientCallImpl.access$300(ClientCallImpl.java:70)
~[io.grpc-grpc-core-1.45.1.jar:1.45.1]
at
io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInternal(ClientCallImpl.java:743)
~[io.grpc-grpc-core-1.45.1.jar:1.45.1]
at
io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:722)
~[io.grpc-grpc-core-1.45.1.jar:1.45.1]
at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
~[io.grpc-grpc-core-1.45.1.jar:1.45.1]
at
io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133)
~[io.grpc-grpc-core-1.45.1.jar:1.45.1]
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
~[?:?]
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
~[?:?]
at java.lang.Thread.run(Thread.java:833) [?:?]
Caused by:
io.grpc.netty.shaded.io.netty.channel.AbstractChannel$AnnotatedConnectException:
Connection refused: /192.168.1.2:4181
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.Net.pollConnect(Native Method) ~[?:?]
at sun.nio.ch.Net.pollConnectNow(Net.java:672) ~[?:?]
at
sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:946) ~[?:?]
at
io.grpc.netty.shaded.io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:330)
~[io.grpc-grpc-netty-shaded-1.45.1.jar:1.45.1]
at
io.grpc.netty.shaded.io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:334)
~[io.grpc-grpc-netty-shaded-1.45.1.jar:1.45.1]
at
io.grpc.netty.shaded.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:710)
~[io.grpc-grpc-netty-shaded-1.45.1.jar:1.45.1]
at
io.grpc.netty.shaded.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:658)
~[io.grpc-grpc-netty-shaded-1.45.1.jar:1.45.1]
at
io.grpc.netty.shaded.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:584)
~[io.grpc-grpc-netty-shaded-1.45.1.jar:1.45.1]
at
io.grpc.netty.shaded.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:496)
~[io.grpc-grpc-netty-shaded-1.45.1.jar:1.45.1]
at
io.grpc.netty.shaded.io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986)
~[io.grpc-grpc-netty-shaded-1.45.1.jar:1.45.1]
at
io.grpc.netty.shaded.io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
~[io.grpc-grpc-netty-shaded-1.45.1.jar:1.45.1]
at
io.grpc.netty.shaded.io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
~[io.grpc-grpc-netty-shaded-1.45.1.jar:1.45.1]
... 1 more
```
Port 4181 is a bookkeeper port, and for some reason the client is trying to
connect to a non-local host, but we already know that the bookkeeper should be
running on localhost. I see the same behavior on my host machine:
```
Caused by:
io.grpc.netty.shaded.io.netty.channel.AbstractChannel$AnnotatedConnectException:
Connection refused: /23.202.231.169:4181
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.Net.pollConnect(Native Method) ~[?:?]
at sun.nio.ch.Net.pollConnectNow(Net.java:672) ~[?:?]
at
sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:946) ~[?:?]
at
io.grpc.netty.shaded.io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:330)
~[io.grpc-grpc-netty-shaded-1.45.1.jar:1.45.1]
at
io.grpc.netty.shaded.io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:334)
~[io.grpc-grpc-netty-shaded-1.45.1.jar:1.45.1]
at
io.grpc.netty.shaded.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:710)
~[io.grpc-grpc-netty-shaded-1.45.1.jar:1.45.1]
at
io.grpc.netty.shaded.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:658)
~[io.grpc-grpc-netty-shaded-1.45.1.jar:1.45.1]
at
io.grpc.netty.shaded.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:584)
~[io.grpc-grpc-netty-shaded-1.45.1.jar:1.45.1]
at
io.grpc.netty.shaded.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:496)
~[io.grpc-grpc-netty-shaded-1.45.1.jar:1.45.1]
at
io.grpc.netty.shaded.io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986)
~[io.grpc-grpc-netty-shaded-1.45.1.jar:1.45.1]
at
io.grpc.netty.shaded.io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
~[io.grpc-grpc-netty-shaded-1.45.1.jar:1.45.1]
at
io.grpc.netty.shaded.io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
~[io.grpc-grpc-netty-shaded-1.45.1.jar:1.45.1]
... 1 more
```
Interestingly, in my case, Pulsar fails when I am off my work VPN, but it
works when I am on the VPN. I've see the same behavior when running Pulsar
Standalone, which makes sense since the brew installation is Pulsar Standalone.
The nuance comes here:
https://github.com/apache/pulsar/blob/76646cfd9496820a6a6dbee225067045235838c0/pulsar-broker-common/src/main/java/org/apache/pulsar/broker/ServiceConfigurationUtils.java#L43-L51.
If Pulsar does not have its `advertisedAddress` configured, it performs this
lookup, and that lookup relies on a reverse lookup of your hostname, as
described here https://stackoverflow.com/a/18488982. After updating my
`/etc/hosts` file to map my hostname to map my hostname to `127.0.0.1`,
standalone now works and I've confirmed that 2.10.1 works after putting my Mac
to sleep and waking it up again. (I only just realized tonight that Standalone
was not previously working because my hostname was not correctly resolving to
`127.0.0.1`.)
I also noticed that running `nslookup` on my host name resulted in
`23.202.231.169` before updating my `/etc/hosts` file and when off the VPN.
Let me know if this helps.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]