2020-06-13 12:32:58 UTC - megachucky: @megachucky has joined the channel
----
2020-06-13 13:01:32 UTC - Sankararao Routhu: <!here> do we need to restart 
proxies after restarting brokers? My publisher connections are failing through 
proxy when brokers are restarted unless I restart my proxies
----
2020-06-13 17:01:51 UTC - Asaf Mesika: Can’t you get lag count per topic?
----
2020-06-13 17:03:34 UTC - Gilles Barbier: lag count?
----
2020-06-13 17:04:37 UTC - Asaf Mesika: In Kafka for example per consumer you 
can get the amount of records which haven’t been consumed yet 
----
2020-06-13 17:05:01 UTC - Asaf Mesika: I presumed it is the same in Pulsar
----
2020-06-13 17:07:03 UTC - Gilles Barbier: Not sure about that - but we obtain 
counters for each job status using pulsar functions and counters 
(<https://pulsar.apache.org/docs/en/2.5.2/functions-develop/#api>)
----
2020-06-13 17:19:50 UTC - Asaf Mesika: But Pulsar functions have a different 
subscription and rate of consumption compared to job workers, no?
----
2020-06-13 17:28:57 UTC - Gilles Barbier: In our case, a "dispatchJob" 
message is handled by a function before being sent to workers. This function 
maintains a state describing each job's processing. Workers also send a 
status back to this function. It's more than a simple task queue
----
2020-06-13 18:48:53 UTC - Rutvij: @Rutvij has joined the channel
----
2020-06-13 20:38:37 UTC - Marcio Martins: Anyone know what would cause this?
```20:28:26.093 [BookKeeperClientScheduler-OrderedScheduler-0-0] ERROR 
org.apache.bookkeeper.client.TopologyAwareEnsemblePlacementPolicy - Unexpected 
exception while handling joining bookie 
bookie-1.bookie.pulsar.svc.cluster.local:3181
java.lang.NullPointerException: null
        at 
org.apache.bookkeeper.net.NetUtils.resolveNetworkLocation(NetUtils.java:77) 
~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
        at 
org.apache.bookkeeper.client.TopologyAwareEnsemblePlacementPolicy.resolveNetworkLocation(TopologyAwareEnsemblePlacementPolicy.java:779)
 ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
        at 
org.apache.bookkeeper.client.TopologyAwareEnsemblePlacementPolicy.createBookieNode(TopologyAwareEnsemblePlacementPolicy.java:775)
 ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
        at 
org.apache.bookkeeper.client.TopologyAwareEnsemblePlacementPolicy.handleBookiesThatJoined(TopologyAwareEnsemblePlacementPolicy.java:707)
 ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
        at 
org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl.handleBookiesThatJoined(RackawareEnsemblePlacementPolicyImpl.java:79)
 ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
        at 
org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicy.handleBookiesThatJoined(RackawareEnsemblePlacementPolicy.java:246)
 ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
        at 
org.apache.bookkeeper.client.TopologyAwareEnsemblePlacementPolicy.onClusterChanged(TopologyAwareEnsemblePlacementPolicy.java:654)
 ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
        at 
org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicyImpl.onClusterChanged(RackawareEnsemblePlacementPolicyImpl.java:79)
 ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
        at 
org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicy.onClusterChanged(RackawareEnsemblePlacementPolicy.java:89)
 ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
        at 
org.apache.bookkeeper.client.BookieWatcherImpl.processWritableBookiesChanged(BookieWatcherImpl.java:171)
 ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
        at 
org.apache.bookkeeper.client.BookieWatcherImpl.lambda$initialBlockingBookieRead$1(BookieWatcherImpl.java:206)
 ~[org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
        at 
org.apache.bookkeeper.discover.ZKRegistrationClient$WatchTask.accept(ZKRegistrationClient.java:139)
 [org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
        at 
org.apache.bookkeeper.discover.ZKRegistrationClient$WatchTask.accept(ZKRegistrationClient.java:62)
 [org.apache.bookkeeper-bookkeeper-server-4.10.0.jar:4.10.0]
        at 
java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774)
 [?:1.8.0_242]
        at 
java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750)
 [?:1.8.0_242]
        at 
java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:456)
 [?:1.8.0_242]
        at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
[?:1.8.0_242]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
[?:1.8.0_242]
        at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
 [?:1.8.0_242]
        at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
 [?:1.8.0_242]
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
[?:1.8.0_242]
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
[?:1.8.0_242]
        at 
io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
 [io.netty-netty-common-4.1.45.Final.jar:4.1.45.Final]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_242]```

----
2020-06-13 20:39:32 UTC - Marcio Martins: Before this, I had official zookeeper 
and bookkeeper clusters running with no issues - I switched to the `pulsar-all` 
images and am now getting this...
----
2020-06-13 20:49:23 UTC - Anup Ghatage: @Marcio Martins
Looks like we’re failing to resolve the hostname for the BookieSocketAddress of 
`bookie-1.bookie.pulsar.svc.cluster.local:3181`
If we dig deeper, we see that this `null` address is coming from:
```TopologyAwareEnsemblePlacementPolicy.java
L:649 joinedBookies = Sets.difference(writableBookies, 
oldBookieSet).immutableCopy();```
Guava's `Sets.difference` only returns items which are present in set1 but not 
in set2; if the result is empty, it means both sets are the same.

You might just have all of the bookies in read only mode and none in read-write 
mode.

If you have a moment we can side-bar and I can show you how to check if that’s 
the case.

Still, an NPE is something that is *not* expected. @Sijie Guo do you recommend 
we open a bug on this one?
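To illustrate the `Sets.difference` semantics above with a quick standalone sketch (Python set difference standing in for the Guava call; the bookie IDs are placeholders):

```python
# Guava's Sets.difference(a, b) returns the elements of a that are not in b.
# Python's set difference has the same semantics.
old_bookies = {"bookie-1:3181", "bookie-2:3181"}   # placeholder bookie IDs
new_bookies = {"bookie-1:3181", "bookie-2:3181"}

joined = new_bookies - old_bookies  # bookies that just joined
print(joined)  # set() -- empty when both sets are the same, i.e. nothing joined
```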
----
2020-06-13 20:53:01 UTC - Marcio Martins: Yes, any help would be great!
----
2020-06-13 20:53:13 UTC - Marcio Martins: Thank you!
----
2020-06-13 20:53:37 UTC - Anup Ghatage: Can you please log into any bookie node 
and execute this shell command:

`bin/bookkeeper shell listbookies -h -rw`
----
2020-06-13 20:54:52 UTC - Marcio Martins: I get Fail to process command 'list'
----
2020-06-13 20:55:12 UTC - Marcio Martins: ```20:54:40.880 [main-EventThread] 
INFO  org.apache.bookkeeper.zookeeper.ZooKeeperWatcherBase - ZooKeeper client 
is connected now.
ReadWrite Bookies :
20:54:41.022 [main-EventThread] INFO  org.apache.zookeeper.ClientCnxn - 
EventThread shut down for session: 0x300015d235e000b```

----
2020-06-13 20:55:41 UTC - Marcio Martins: Seems like there are no read-write 
bookies
----
2020-06-13 20:55:41 UTC - Anup Ghatage: And now try this:
`bin/bookkeeper shell listbookies -h -ro`
----
2020-06-13 20:55:58 UTC - Marcio Martins: No bookie exists!
----
2020-06-13 20:56:15 UTC - Anup Ghatage: Hmm, just as I expected. The BookKeeper 
deployment has gone wrong.
----
2020-06-13 20:56:41 UTC - Marcio Martins: All bookies are connected to zookeeper
----
2020-06-13 20:57:03 UTC - Marcio Martins: and I initialized the cluster with 
`initnewcluster`
----
2020-06-13 20:57:14 UTC - Marcio Martins: I think on the old deployment script 
I was using `shell metaformat`
----
2020-06-13 20:57:18 UTC - Marcio Martins: Could that be it?
----
2020-06-13 20:57:20 UTC - Anup Ghatage: So just to be clear, you are seeing 
nothing in read-write or read-only bookies?
----
2020-06-13 20:57:31 UTC - Marcio Martins: yes, they are both empty
----
2020-06-13 20:57:51 UTC - Anup Ghatage: >  I think on the old deployment 
script I was using `shell metaformat`
Not sure what the old deployment was doing.
----
2020-06-13 21:00:05 UTC - Marcio Martins: So this is the cluster initialization 
before the bookies were started for the first time:
```20:47:06.286 [main] INFO  
org.apache.bookkeeper.discover.ZKRegistrationManager - Successfully initiated 
cluster. ZKServers: 
zookeeper-0.zookeeper.pulsar.svc.cluster.local:2181,zookeeper-1.zookeeper.pulsar.svc.cluster.local:2181,zookeeper-2.zookeeper.pulsar.svc.cluster.local:2181
 ledger root path: /ledgers instanceId: d522dd92-aea6-4d29-a3b2-2101b0ef754d
20:47:06.390 [main-EventThread] INFO  org.apache.zookeeper.ClientCnxn - 
EventThread shut down for session: 0x100015d27a00000
20:47:06.390 [main] INFO  org.apache.zookeeper.ZooKeeper - Session: 
0x100015d27a00000 closed```

----
2020-06-13 21:00:49 UTC - Anup Ghatage: That looks fine
----
2020-06-13 21:01:05 UTC - Anup Ghatage: @Marcio Martins
Perhaps it's best to start a thread on the dev@ mailing list.
If you've done everything vanilla and this still happens, it might be worth 
updating the documentation to reflect the changes required in the way we're 
deploying.
----
2020-06-13 21:01:10 UTC - Marcio Martins: The bookies see each other:
```20:49:34.489 [BookKeeperClientScheduler-OrderedScheduler-0-0] INFO  
org.apache.bookkeeper.net.NetworkTopologyImpl - Adding a new node: 
/default-rack/bookie-1.bookie.pulsar.svc.cluster.local:3181
20:49:34.494 [BookKeeperClientScheduler-OrderedScheduler-0-0] INFO  
org.apache.bookkeeper.net.NetworkTopologyImpl - Adding a new node: 
/default-rack/bookie-2.bookie.pulsar.svc.cluster.local:3181```

----
2020-06-13 21:02:04 UTC - Marcio Martins: No, I didn't do everything vanilla; I 
adapted the helm charts in the repo. I'm not sure what went wrong, I think 
I got everything correct...
----
2020-06-13 21:46:06 UTC - Anup Ghatage: Can you also try the `listcookies` 
command? Let's check if the bookies are registered at least
----
2020-06-14 02:55:12 UTC - Liam Clarke: Do you have
```brokerDeleteInactiveTopicsEnabled=false```
configured?
----
2020-06-14 02:58:11 UTC - Liam Clarke: Hi all, I'm testing BK's autorecovery, 
and it's not working as I'd assume, and I'd be appreciative of any guidance.

I have 3 bookies running with Docker-compose, and I've configured the namespace 
accordingly:

```bin/pulsar-admin namespaces set-persistence test-tenant/test-namespace \
                            --bookkeeper-ensemble 2 \
                            --bookkeeper-ack-quorum 1 \
                            --bookkeeper-write-quorum 1 \
                            --ml-mark-delete-max-rate 0 ```
I created a topic and fired some data at it, and obtained the ledgerId:

```./pulsar-admin topics info-internal test-tenant/test-namespace/example-topic
{
  "version": 1,
  "creationDate": "2020-06-14T02:45:08.961Z",
  "modificationDate": "2020-06-14T02:45:09.055Z",
  "ledgers": [
    {
      "ledgerId": 42
    }
  ],
  "cursors": {}
}```
I identified the Bookies in the ledger's ensemble:

```./bookkeeper shell ledgermetadata -ledgerid 42
ledgerID: 42
LedgerMetadata{formatVersion=3, ensembleSize=2, writeQuorumSize=1, 
ackQuorumSize=1, state=OPEN, digestType=CRC32C, password=base64:, 
ensembles={0=[172.21.0.5:3181, 172.21.0.4:3181]}, 
customMetadata={component=base64:bWFuYWdlZC1sZWRnZXI=, 
pulsar/managed-ledger=base64:dGVzdC10ZW5hbnQvdGVzdC1uYW1lc3BhY2UvcGVyc2lzdGVudC9leGFtcGxlLXRvcGlj,
 application=base64:cHVsc2Fy}}```
I then `docker-compose kill`  one of the bookies in the ensemble (in this case, 
172.21.0.5).

The autorecovery auditor knows it's underreplicated:

```./bookkeeper shell listunderreplicated
42
        Ctime : 1592103060751```
But the behaviour I'm expecting (that it creates a new replica on a bookie 
not previously part of the ensemble) isn't occurring. In the logs I see this:

```bookie2      | 2020-06-14 03:04:19,401 - ERROR - 
[bookkeeper-io-14-10:PerChannelBookieClient$ConnectionFutureListener@2454] - 
Could not connect to bookie: [id: 0x0bec2643]/172.21.0.4:3181, current state 
CONNECTING : 
...
bookie2      | Caused by: java.net.NoRouteToHostException: No route to host
bookie2      |  ... 11 more
bookie2      | 2020-06-14 03:04:19,402 - ERROR - 
[BookKeeperClientWorker-OrderedExecutor-10-0:ReadLastConfirmedOp@141] - While 
readLastConfirmed ledger: 42 did not hear success responses from all quorums
bookie2      | 2020-06-14 03:04:19,402 - INFO  - 
[ReplicationWorker:ReplicationWorker@290] - BKReadException while rereplicating 
ledger 42. Enough Bookies might not have available So, no harm to continue```

Does the last log message mean that the replica on 172.21.0.4 wasn't up to date 
enough to replicate from?
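For context on why 2/1/1 is fragile: BookKeeper stripes entries round-robin across the ensemble and writes each entry to `write-quorum` bookies, so the write quorum is effectively the replication factor. A rough sketch of that placement (simplified; not the actual `RoundRobinDistributionSchedule`):

```python
def bookies_for_entry(entry_id, ensemble, write_quorum):
    """Bookies holding a given entry under round-robin striping:
    write_quorum bookies starting at position entry_id % len(ensemble)."""
    start = entry_id % len(ensemble)
    return [ensemble[(start + i) % len(ensemble)] for i in range(write_quorum)]

ens = ["172.21.0.5:3181", "172.21.0.4:3181"]  # the ensemble from the metadata above
print(bookies_for_entry(0, ens, 1))  # ['172.21.0.5:3181']
print(bookies_for_entry(1, ens, 1))  # ['172.21.0.4:3181']
# With write quorum 1, roughly half the entries live only on the killed bookie.
```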
----
2020-06-14 03:05:00 UTC - Anup Ghatage: Hi @Liam Clarke,
What is the problem you’re seeing exactly?
I assume its that auto recovery is not replicating?
Couple of things you could try here:
• Have you tried with a higher ensemble and quorum numbers? (Try 3,3,3 / 3,2,2)
• Use bookie shell to check the underreplicated ledgers when you’re expecting 
them to replicate
Let's side-bar if you have more questions.
----
2020-06-14 03:06:20 UTC - Liam Clarke: Thanks for the reply @Anup Ghatage I 
accidentally sent the message before putting all the details in 
:slightly_smiling_face: Please see my edit above.
----
2020-06-14 03:06:35 UTC - Anup Ghatage: Sure, going through it right now.
----
2020-06-14 03:07:59 UTC - Liam Clarke: I'm guessing that the lower ack / write 
quorums meant that the data on the remaining ensemble member wasn't up to date 
when I killed the bookie?
----
2020-06-14 03:10:02 UTC - Anup Ghatage: Yeah, looks like it.
Can you try with higher quorum numbers?
Try 3/3/3 (E/W/A), assuming you have more bookies deployed.
----
2020-06-14 03:16:35 UTC - Anup Ghatage: @Liam Clarke You can also force the 
auto replication to happen via the `triggeraudit`  command
----
2020-06-14 03:29:10 UTC - Liam Clarke: Okay with 3/3/3 I get a `Failure 
NotEnoughBookiesException: Not enough non-faulty bookies available while 
writing entry: 3000 while recovering ledger: 18`

Which makes sense now that I've only got two bookies.

I tried again with 2/2/2, so that after killing 1 of the 3 nodes it still has 
enough to create an ensemble, and it's still not working as I expect.

When running `bookkeeper shell recover -l <ledgerid> <bookieid>` it 
tries repeatedly (and fails) to connect to the failed bookie.
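A back-of-the-envelope check (not the actual placement logic) for why 3/3/3 fails with one bookie down: re-replication has to form a full new ensemble from the surviving bookies:

```python
def can_rereplicate(total_bookies, failed_bookies, ensemble_size):
    """Re-replication needs at least ensemble_size surviving bookies
    to form a new ensemble for the recovered fragment."""
    return (total_bookies - failed_bookies) >= ensemble_size

print(can_rereplicate(3, 1, 3))  # False -> NotEnoughBookiesException
print(can_rereplicate(3, 1, 2))  # True  -> 2/2/2 can recover
```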
----
2020-06-14 03:35:16 UTC - Liam Clarke: Wait, I tell a lie, it's recovered now.
----
2020-06-14 03:35:34 UTC - Anup Ghatage: You owe me a beer 
:stuck_out_tongue_closed_eyes:
----
2020-06-14 03:50:51 UTC - Liam Clarke: Haha, thank you for your help, I'll 
happily buy you one the next time you're in NZ.

Ahhhhh might have been because the ledger was still open, according to the docs:

>  If the replication worker finds a fragment which needs rereplication, but 
does not have a defined endpoint (i.e. the final fragment of a ledger currently 
being written to), it will wait for a grace period before attempting 
rereplication
Prior to it recovering, the ledger was still open, according to the metadata at 
least.
+1 : Anup Ghatage
----
2020-06-14 04:52:28 UTC - Sijie Guo: you don't need to do that. Did you see any 
errors in the proxy log?
----
2020-06-14 05:13:27 UTC - Sankararao Routhu: Hi @Sijie Guo, thanks for 
replying. I see the following error in the proxy logs
----
2020-06-14 05:13:29 UTC - Sankararao Routhu: ```2020-06-13 22:12:05,560 -0700 
[pulsar-proxy-io-2-4] INFO  org.apache.pulsar.proxy.server.ProxyConnection - 
[/35.167.191.252:57232] Connection closed
2020-06-13 22:12:05,602 -0700 [pulsar-proxy-io-2-6] WARN  
org.apache.pulsar.proxy.server.LookupProxyHandler - [/103.15.250.25:35321] 
Failed to get next active broker No active broker is available
org.apache.pulsar.broker.PulsarServerException: No active broker is available
        at 
org.apache.pulsar.proxy.server.BrokerDiscoveryProvider.nextBroker(BrokerDiscoveryProvider.java:94)
 ~[org.apache.pulsar-pulsar-proxy-2.5.0.jar:2.5.0]
        at 
org.apache.pulsar.proxy.server.LookupProxyHandler.handleLookup(LookupProxyHandler.java:106)
 [org.apache.pulsar-pulsar-proxy-2.5.0.jar:2.5.0]
        at 
org.apache.pulsar.proxy.server.ProxyConnection.handleLookup(ProxyConnection.java:387)
 [org.apache.pulsar-pulsar-proxy-2.5.0.jar:2.5.0]
        at 
org.apache.pulsar.common.protocol.PulsarDecoder.channelRead(PulsarDecoder.java:126)
 [org.apache.pulsar-pulsar-common-2.5.0.jar:2.5.0]
        at 
org.apache.pulsar.proxy.server.ProxyConnection.channelRead(ProxyConnection.java:174)
 [org.apache.pulsar-pulsar-proxy-2.5.0.jar:2.5.0]```
----
2020-06-14 05:13:55 UTC - Sankararao Routhu: But my broker is active, as I 
restarted it
----
2020-06-14 05:15:30 UTC - Sankararao Routhu: Here is the complete stack trace 
@Sijie Guo
----
2020-06-14 05:15:32 UTC - Sankararao Routhu: ```2020-06-13 22:12:05,560 -0700 
[pulsar-proxy-io-2-4] INFO  org.apache.pulsar.proxy.server.ProxyConnection - 
[/35.167.191.252:57232] Connection closed
2020-06-13 22:12:05,602 -0700 [pulsar-proxy-io-2-6] WARN  
org.apache.pulsar.proxy.server.LookupProxyHandler - [/103.15.250.25:35321] 
Failed to get next active broker No active broker is available
org.apache.pulsar.broker.PulsarServerException: No active broker is available
        at 
org.apache.pulsar.proxy.server.BrokerDiscoveryProvider.nextBroker(BrokerDiscoveryProvider.java:94)
 ~[org.apache.pulsar-pulsar-proxy-2.5.0.jar:2.5.0]
        at 
org.apache.pulsar.proxy.server.LookupProxyHandler.handleLookup(LookupProxyHandler.java:106)
 [org.apache.pulsar-pulsar-proxy-2.5.0.jar:2.5.0]
        at 
org.apache.pulsar.proxy.server.ProxyConnection.handleLookup(ProxyConnection.java:387)
 [org.apache.pulsar-pulsar-proxy-2.5.0.jar:2.5.0]
        at 
org.apache.pulsar.common.protocol.PulsarDecoder.channelRead(PulsarDecoder.java:126)
 [org.apache.pulsar-pulsar-common-2.5.0.jar:2.5.0]
        at 
org.apache.pulsar.proxy.server.ProxyConnection.channelRead(ProxyConnection.java:174)
 [org.apache.pulsar-pulsar-proxy-2.5.0.jar:2.5.0]
        at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374)
 [io.netty-netty-transport-4.1.43.Final.jar:4.1.43.Final]
        at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360)
 [io.netty-netty-transport-4.1.43.Final.jar:4.1.43.Final]
        at 
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:352)
 [io.netty-netty-transport-4.1.43.Final.jar:4.1.43.Final]
        at 
io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:326)
 [io.netty-netty-codec-4.1.43.Final.jar:4.1.43.Final]
        at 
io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:300)
 [io.netty-netty-codec-4.1.43.Final.jar:4.1.43.Final]
        at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374)
 [io.netty-netty-transport-4.1.43.Final.jar:4.1.43.Final]
        at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360)
 [io.netty-netty-transport-4.1.43.Final.jar:4.1.43.Final]
        at 
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:352)
 [io.netty-netty-transport-4.1.43.Final.jar:4.1.43.Final]
        at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1478) 
[io.netty-netty-handler-4.1.43.Final.jar:4.1.43.Final]
        at 
io.netty.handler.ssl.SslHandler.decodeNonJdkCompatible(SslHandler.java:1239) 
[io.netty-netty-handler-4.1.43.Final.jar:4.1.43.Final]
        at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1276) 
[io.netty-netty-handler-4.1.43.Final.jar:4.1.43.Final]
        at 
io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:503)
 [io.netty-netty-codec-4.1.43.Final.jar:4.1.43.Final]
        at 
io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:442)
 [io.netty-netty-codec-4.1.43.Final.jar:4.1.43.Final]
        at 
io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:281)
 [io.netty-netty-codec-4.1.43.Final.jar:4.1.43.Final]
        at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374)
 [io.netty-netty-transport-4.1.43.Final.jar:4.1.43.Final]
        at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360)
 [io.netty-netty-transport-4.1.43.Final.jar:4.1.43.Final]
        at 
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:352)
 [io.netty-netty-transport-4.1.43.Final.jar:4.1.43.Final]
        at 
io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1422)
 [io.netty-netty-transport-4.1.43.Final.jar:4.1.43.Final]
        at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374)
 [io.netty-netty-transport-4.1.43.Final.jar:4.1.43.Final]
        at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360)
 [io.netty-netty-transport-4.1.43.Final.jar:4.1.43.Final]
        at 
io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:931)
 [io.netty-netty-transport-4.1.43.Final.jar:4.1.43.Final]
        at 
io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163)
 [io.netty-netty-transport-4.1.43.Final.jar:4.1.43.Final]
        at 
io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:700) 
[io.netty-netty-transport-4.1.43.Final.jar:4.1.43.Final]
        at 
io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:635)
 [io.netty-netty-transport-4.1.43.Final.jar:4.1.43.Final]
        at 
io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:552) 
[io.netty-netty-transport-4.1.43.Final.jar:4.1.43.Final]
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:514) 
[io.netty-netty-transport-4.1.43.Final.jar:4.1.43.Final]
        at 
io.netty.util.concurrent.SingleThreadEventExecutor$6.run(SingleThreadEventExecutor.java:1050)
 [io.netty-netty-common-4.1.43.Final.jar:4.1.43.Final]
        at 
io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) 
[io.netty-netty-common-4.1.43.Final.jar:4.1.43.Final]
        at 
io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
 [io.netty-netty-common-4.1.43.Final.jar:4.1.43.Final]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_181]```
----
2020-06-14 05:16:37 UTC - Sankararao Routhu: We are using zookeeper for service 
discovery from proxy
----
2020-06-14 05:21:02 UTC - Sijie Guo: If you are running k8s, I usually 
recommend using brokerServiceURL for service discovery instead of zookeeper 
discovery.
----
2020-06-14 05:21:37 UTC - Sijie Guo: because of session expiry, zookeeper 
discovery can be a problem.
----
2020-06-14 05:26:40 UTC - Sankararao Routhu: We are not running in K8s @Sijie 
Guo
----
2020-06-14 05:27:10 UTC - Sankararao Routhu: is it failing because we are using 
zookeeper service discovery?
----
2020-06-14 05:27:50 UTC - Sankararao Routhu: We had a challenge using the 
broker service URL, so we switched to service discovery
----
2020-06-14 05:29:42 UTC - Sankararao Routhu: We have multiple broker AWS 
instances behind an NLB. The proxy was not able to connect to the broker NLB
----
2020-06-14 05:30:29 UTC - Sankararao Routhu: Is the proxy not able to connect 
after restarting brokers because of zookeeper service discovery, @Sijie Guo?
----
2020-06-14 05:37:40 UTC - Sijie Guo: it seems that after restarting brokers, 
the broker cache in the proxies became empty, hence the proxies are not able to 
find any brokers to connect to.

If you have set up an NLB, is it an internal LB or a public LB? The proxy just 
needs to connect to the NLB
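For illustration, a minimal sketch of the proxy.conf Sijie describes, using direct broker URLs instead of zookeeper discovery (the hostname and ports here are placeholders; `brokerServiceURLTLS` / `brokerWebServiceURLTLS` are the TLS variants):

```
# proxy.conf -- direct broker discovery (placeholder hostname/ports)
brokerServiceURL=pulsar://broker-nlb.example.com:6650
brokerWebServiceURL=http://broker-nlb.example.com:8080
# leave zookeeperServers / configurationStoreServers unset when using the URLs above
```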
----
2020-06-14 05:41:19 UTC - Sankararao Routhu: it's a public NLB
----
2020-06-14 05:41:37 UTC - Sankararao Routhu: if I use the NLB in the proxy 
then I get the following error
----
2020-06-14 05:41:40 UTC - Sankararao Routhu: ```2020-06-13 22:39:19,494 -0700 
[pulsar-proxy-io-2-4] WARN  org.apache.pulsar.proxy.server.LookupProxyHandler - 
[<persistent://identity/idm/failovertest2.Queue>] failed to get Partitioned 
metadata : org.apache.pulsar.client.api.PulsarClientException: Connection 
already closed
java.util.concurrent.CompletionException: 
org.apache.pulsar.client.api.PulsarClientException$LookupException: 
org.apache.pulsar.client.api.PulsarClientException: Connection already closed
        at 
java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292)
 ~[?:1.8.0_181]
        at 
java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308)
 ~[?:1.8.0_181]
        at 
java.util.concurrent.CompletableFuture.uniAccept(CompletableFuture.java:647) 
~[?:1.8.0_181]
        at 
java.util.concurrent.CompletableFuture$UniAccept.tryFire(CompletableFuture.java:632)
 ~[?:1.8.0_181]
        at 
java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474) 
[?:1.8.0_181]
        at 
java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
 [?:1.8.0_181]
        at 
org.apache.pulsar.client.impl.ClientCnx.handlePartitionResponse(ClientCnx.java:505)
 [org.apache.pulsar-pulsar-client-original-2.5.0.jar:2.5.0]
        at 
org.apache.pulsar.common.protocol.PulsarDecoder.channelRead(PulsarDecoder.java:120)
 [org.apache.pulsar-pulsar-common-2.5.0.jar:2.5.0]
        at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374)
 [io.netty-netty-transport-4.1.43.Final.jar:4.1.43.Final]
        at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360)
 [io.netty-netty-transport-4.1.43.Final.jar:4.1.43.Final]
        at 
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:352)
 [io.netty-netty-transport-4.1.43.Final.jar:4.1.43.Final]
        at 
io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:326)
 [io.netty-netty-codec-4.1.43.Final.jar:4.1.43.Final]
        at 
io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:300)
 [io.netty-netty-codec-4.1.43.Final.jar:4.1.43.Final]
        at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374)
 [io.netty-netty-transport-4.1.43.Final.jar:4.1.43.Final]
        at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360)
 [io.netty-netty-transport-4.1.43.Final.jar:4.1.43.Final]
        at 
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:352)
 [io.netty-netty-transport-4.1.43.Final.jar:4.1.43.Final]
        at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1478) 
[io.netty-netty-handler-4.1.43.Final.jar:4.1.43.Final]
        at 
io.netty.handler.ssl.SslHandler.decodeNonJdkCompatible(SslHandler.java:1239) 
[io.netty-netty-handler-4.1.43.Final.jar:4.1.43.Final]
        at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1276) 
[io.netty-netty-handler-4.1.43.Final.jar:4.1.43.Final]
        at 
io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:503)
 [io.netty-netty-codec-4.1.43.Final.jar:4.1.43.Final]
        at 
io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:442)
 [io.netty-netty-codec-4.1.43.Final.jar:4.1.43.Final]
        at 
io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:281)
 [io.netty-netty-codec-4.1.43.Final.jar:4.1.43.Final]
        at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374)
 [io.netty-netty-transport-4.1.43.Final.jar:4.1.43.Final]
        at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360)
 [io.netty-netty-transport-4.1.43.Final.jar:4.1.43.Final]
        at 
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:352)
 [io.netty-netty-transport-4.1.43.Final.jar:4.1.43.Final]
        at 
io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1422)
 [io.netty-netty-transport-4.1.43.Final.jar:4.1.43.Final]
        at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374)
 [io.netty-netty-transport-4.1.43.Final.jar:4.1.43.Final]
        at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360)
 [io.netty-netty-transport-4.1.43.Final.jar:4.1.43.Final]
        at 
io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:931)
 [io.netty-netty-transport-4.1.43.Final.jar:4.1.43.Final]
        at 
io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163)
 [io.netty-netty-transport-4.1.43.Final.jar:4.1.43.Final]
        at 
io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:700) 
[io.netty-netty-transport-4.1.43.Final.jar:4.1.43.Final]
        at 
io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:635)
 [io.netty-netty-transport-4.1.43.Final.jar:4.1.43.Final]
        at 
io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:552) 
[io.netty-netty-transport-4.1.43.Final.jar:4.1.43.Final]
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:514) 
[io.netty-netty-transport-4.1.43.Final.jar:4.1.43.Final]
        at 
io.netty.util.concurrent.SingleThreadEventExecutor$6.run(SingleThreadEventExecutor.java:1050)
 [io.netty-netty-common-4.1.43.Final.jar:4.1.43.Final]
        at 
io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) 
[io.netty-netty-common-4.1.43.Final.jar:4.1.43.Final]
        at 
io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
 [io.netty-netty-common-4.1.43.Final.jar:4.1.43.Final]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_181]
Caused by: org.apache.pulsar.client.api.PulsarClientException$LookupException: 
org.apache.pulsar.client.api.PulsarClientException: Connection already closed
        at 
org.apache.pulsar.client.impl.ClientCnx.getPulsarClientException(ClientCnx.java:987)
 ~[org.apache.pulsar-pulsar-client-original-2.5.0.jar:2.5.0]
        at 
org.apache.pulsar.client.impl.ClientCnx.handlePartitionResponse(ClientCnx.java:506)
 ~[org.apache.pulsar-pulsar-client-original-2.5.0.jar:2.5.0]
        ... 31 more
2020-06-13 22:39:19,495 -0700 [pulsar-client-shutdown-thread] INFO  
org.apache.pulsar.proxy.server.ProxyConnectionPool - Closing 
ProxyConnectionPool.```
----
2020-06-14 05:42:44 UTC - Sankararao Routhu: Proxy is not able to get 
partitioned metadata
----
2020-06-14 05:42:50 UTC - Sankararao Routhu: @Sijie Guo
----
2020-06-14 05:44:23 UTC - Sankararao Routhu: I have commented out 
zookeeperServers and configurationStoreServers and provided 
brokerServiceURLTLS, brokerWebServiceURLTLS in proxy.conf
----
2020-06-14 05:59:28 UTC - Sankararao Routhu: Hi @Sijie Guo
----
2020-06-14 05:59:49 UTC - Sankararao Routhu: can you please let me know if the 
above config is correct?
----
2020-06-14 06:39:46 UTC - Ali Ahmed: 
<https://github.com/debezium/debezium/pull/1538/>
----
2020-06-14 07:18:06 UTC - Asaf Mesika: Say I have a topic containing tasks to 
do, and a shared subscription, with multiple worker machines using it to 
execute those tasks. How can I know the state of the subscription in terms of 
number of unacknowledged messages? (i.e. size of the queue)
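One hedged pointer: `pulsar-admin topics stats <topic>` reports per-subscription `msgBacklog` and `unackedMessages` counters (field names as in Pulsar's TopicStats JSON). A sketch of reading them, with sample data inlined rather than a live cluster call:

```python
import json

# Sample shaped like `pulsar-admin topics stats <topic>` output (trimmed).
stats_json = """
{
  "subscriptions": {
    "task-workers": { "type": "Shared", "msgBacklog": 1234, "unackedMessages": 17 }
  }
}
"""

stats = json.loads(stats_json)
for name, sub in stats["subscriptions"].items():
    # msgBacklog = messages not yet acknowledged on this subscription
    print(name, "backlog:", sub["msgBacklog"], "unacked:", sub["unackedMessages"])
```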
----
