2020-09-11 09:12:24 UTC - Rolf Arne Corneliussen: Is broker-to-broker 
communication restricted to geo-replication?
----
2020-09-11 09:45:11 UTC - Vil: what does you mean?
----
2020-09-11 10:00:42 UTC - Rolf Arne Corneliussen: I could also ask: Do brokers 
communicate with other brokers within the same cluster?
----
2020-09-11 10:01:16 UTC - Vil: i think no they do not communicate
----
2020-09-11 10:04:27 UTC - Rolf Arne Corneliussen: Ok, thanks. There is a 
section in the `broker.conf` file that suggest otherwise, but the in-line 
documentation might be wrong or out-dated:
`# Authentication settings of the broker itself. Used when the broker connects 
to other brokers,`
`# either in same or other clusters`
`brokerClientTlsEnabled=false`
`brokerClientAuthenticationPlugin=`
`brokerClientAuthenticationParameters=`
`brokerClientTrustCertsFilePath=`
----
2020-09-11 10:08:00 UTC - Vil: not sure then. i have see this setting for the 
pulsar proxy <https://pulsar.apache.org/docs/en/administration-proxy/>
----
2020-09-11 10:08:24 UTC - Vil: maybe @Addison Higham knows?
----
2020-09-11 10:21:38 UTC - Rolf Arne Corneliussen: Yes, the reason for asking is 
that I want to use a pulsar proxy in front of each cluster, and I want to have 
a different authentication mechanism when connecting to a remote cluster than 
the brokers within a cluster are configured to require from connecting clients. 
If brokers communicate within the same cluster, this does not seem to possible.
----
2020-09-11 13:35:58 UTC - Frank Kelly: Brokers DO need to talk to brokers 
within the same cluster so that they can direct the incoming caller to the 
right broker which maintains the Topic/Partition.
+1 : Vil
----
2020-09-11 13:39:45 UTC - Frank Kelly: See for example 
<https://github.com/apache/pulsar/blob/dba0b65a2fd03586cb3d17a306fbda5322318d22/pulsar-broker/src/main/java/org/apache/pulsar/broker/lookup/TopicLookupBase.java#L82-L95>
----
2020-09-11 13:49:43 UTC - Vil: thanks @Frank Kelly. my mistake. i thought about 
bookeeper bookies. not pulsar brokers
----
2020-09-11 13:50:14 UTC - Frank Kelly: No worries
+1 : Vil
----
2020-09-11 14:42:39 UTC - Addison Higham: The consumer api exposes a channel 
based interface, you can see an example in the tests here:
<https://github.com/apache/pulsar-client-go/blob/master/pulsar/consumer_test.go#L457>

That then allows you to write a more common select loop somewhat like this 
example: <https://tour.golang.org/concurrency/5>
+1 : Toktok Rambo
----
2020-09-11 14:45:46 UTC - Addison Higham: @Rolf Arne Corneliussen there are 
some strategies for this situation, the authentication providers API is 
chained, so you can provide multiple implementations. You can then configure 
your brokers to use a different auth plugin from your remote from your external 
clients and the brokers can validate either
+1 : Vil
----
2020-09-11 15:01:57 UTC - Rolf Arne Corneliussen: Thanks for your input @Frank 
Kelly. I am not familiar with the code, but it is not obvious to me that the 
one broker communicates with another to provide a LookupResult? I though the 
look up process only involves getting metadata from ZooKeeper, but I may be 
mistaken.
----
2020-09-11 15:12:54 UTC - Rolf Arne Corneliussen: Thanks @Addison Higham. I 
will try 
`authenticationProviders=org.apache.pulsar.broker.authentication.AuthenticationProviderToken,org.apache.pulsar.broker.authentication.AuthenticationProviderTls.`
 And 
`brokerClientAuthenticationPlugin=org.apache.pulsar.client.impl.auth.AuthenticationTls`.
 I need the latter for geo-replication. If a broker communicates with a peer 
within the same cluster, I guess it will use TLS client auth, as there is no 
settings to get it to use tokens. But I have tried to use token authentication 
in `conf/client.conf` and that works with `pulsar-admin` and `pulsar-client`.
----
2020-09-11 15:22:28 UTC - Addison Higham: yes, we should perhaps have an 
additional set of settings for replication clients to use a different client, 
but at the moment they share the same props
----
2020-09-11 15:27:09 UTC - Rolf Arne Corneliussen: Thanks for helping out, 
@Addison Higham . I might create an issue/PR on github if this turns out to be 
essential for our setup.
----
2020-09-11 16:17:13 UTC - Evan Furman: following back up here. Would love to 
get topic level metrics working
----
2020-09-11 16:29:15 UTC - Sijie Guo: Sorry that I missed the message.
----
2020-09-11 16:29:27 UTC - Sijie Guo: The output doesn’t look like a broker 
metric
----
2020-09-11 16:29:48 UTC - Sijie Guo: Are you sure that you are curling the 
`/metrics` endpoint of brokers?
----
2020-09-11 16:30:00 UTC - Evan Furman: double checking now
----
2020-09-11 16:33:58 UTC - Evan Furman: those are the node exporter metrics
----
2020-09-11 16:35:28 UTC - Evan Furman: ```efurman-mbp15 ansible-ops 
[feature/pulsar●] curl <http://10.3.21.238:8080/metrics>                       
efurman-mbp15 ansible-ops [feature/pulsar●] ```
----
2020-09-11 16:38:23 UTC - Evan Furman: It looks like that should be right 
according to <https://pulsar.apache.org/docs/en/reference-metrics/#broker>
----
2020-09-11 16:39:05 UTC - Evan Furman: ```[ec2-user@ip-10-3-21-238 ~]$ grep 
webServicePort /opt/pulsar/conf/broker.conf 
webServicePort=8080
webServicePortTls=8443```
----
2020-09-11 16:40:35 UTC - Evan Furman: ```[ec2-user@ip-10-3-21-238 ~]$ grep -i 
metric /opt/pulsar/conf/broker.conf | grep -v '#'
replicationMetricsEnabled=true
exposeTopicLevelMetricsInPrometheus=true
exposeConsumerLevelMetricsInPrometheus=true```
----
2020-09-11 16:54:12 UTC - Sijie Guo: Can you go to the broker pods and curl the 
`/metrics` endpoint locally?
----
2020-09-11 16:54:42 UTC - Evan Furman: we’re not running in k8s — we are using 
terraform/ansible to provision on ec2
----
2020-09-11 16:55:25 UTC - Evan Furman: ```[ec2-user@ip-10-3-21-238 ~]$ curl 
localhost:8080/metrics
[ec2-user@ip-10-3-21-238 ~]$ ```
----
2020-09-11 16:55:33 UTC - Evan Furman: this is from a broker ^
----
2020-09-11 17:11:28 UTC - Caleb Epstein: I'm a bit confused about the 
difference between (in the C++ API) the Client.subscribe(string, ...) and 
Client.subscribe(vector&lt;string&gt; ...) methods?  If I try to subscribe for 
N topics using N subscribe calls (and N Consumers), is this fundamentally 
different than doing one .subscribe passing a vector of length N?  I seem not 
be be getting any messages in the vector mode.  Having a single Consumer to 
call .receieve makes my implementation simpler (only one thing to call 
`.receive` on) but I'm not getting any data.
----
2020-09-11 17:12:49 UTC - Brian Li: @Brian Li has joined the channel
----
2020-09-11 17:17:17 UTC - Brian Li: Hi experts, I'm new to pulsar. I'm on 
2.5.2. Recently I found that when I used jmap -histo:live and found the top 
hits are from pulsar. Do you have any idea on why it has so many instances?
----
2020-09-11 17:18:52 UTC - Evan Furman: Yea it doesn’t seem like we’re getting 
broker metrics
----
2020-09-11 17:43:54 UTC - jason_webb: @jason_webb has joined the channel
----
2020-09-11 17:55:20 UTC - Addison Higham: Is this in your application with the 
pulsar-client? If so, how many producers and consumers are you using?
----
2020-09-11 18:24:53 UTC - jason_webb: Hi!  I am new to pulsar and looking at 
geo-replication and subscription types to see how we can solve a few problems.  
 From the docs it was a little hard to tell how replicated subscriptions work 
with different subscription types (key_shared)?
----
2020-09-11 18:37:56 UTC - Addison Higham: replicated subscriptions simply 
snapshot their state periodically and replicate it. It is useful in failover 
scenarios, but does not allow of a true "global" subscription.

What problems are you looking to solve?
----
2020-09-11 19:05:40 UTC - jason_webb: Thanks for the quick response!  We were 
looking at how to replicate data from one system to another. Writes could be 
active/active across two regions.   I was interested to see if there was a way 
to be able to consume from two regions but shared the data based on a key, so 
the keyspace is distributed and pinned to the consuming regions.  Also, if the 
region goes down, be able to reassign the keys.  I guess the key_shared 
functionality but with global distribution.
----
2020-09-11 19:24:21 UTC - Addison Higham: Ah I think your colleague Amit was 
asking me earlier today along these same lines.

One thing I didn't mention to him:

do you know which keys should go to which region when you produce? if so, it 
should be possible to write your own partitioner on the producer, then the 
consumer would only read from the partitions it needs to for it's regions, 
then, if it can't connect to it's own region's cluster, it could fallback to 
reading from the other cluster.

But, it is possible to share a subscription across regions if you have your  
zookeeper and bookkeeper clusters cross regions and then just brokers in each 
region. That is doable, but does take a fair bit of tuning to get stable, which 
is something that we could help with
----
2020-09-11 19:46:28 UTC - Caleb Epstein: Actually it appears that if the same 
topic appears more than once in that vector, the whole subscription is stuck in 
the way I describe.  Need to make sure the list of topics is unique.
----
2020-09-11 23:34:41 UTC - Mehdi Laouichi: @Mehdi Laouichi has joined the channel
----
2020-09-11 23:50:37 UTC - Mehdi Laouichi: My brokers (2.5.1) encounter errors 
on a specific topic partition with "Failed readOffloaded". I have confirmed the 
ledger was NOT offloaded to S3 for some reason. The ledger is still readable 
from the bookies. I tried skipping messages to move the subscription cursor to 
the next ledger but it eventually fails a couple of ledgers later.

Any idea on how to unblock the situation? (Trace in replies)
----
2020-09-11 23:51:08 UTC - Mehdi Laouichi: ```23:37:47.908 [pulsar-io-26-2] INFO 
 
org.apache.pulsar.broker.service.persistent.PersistentDispatcherMultipleConsumers
 - [<persistent://xxx/xxx/topic_name-partition-1> / SubscriptionName] Retrying 
read operation
23:37:47.986 [offloader-OrderedScheduler-0-0] ERROR 
org.apache.bookkeeper.mledger.offload.jcloud.impl.BlobStoreManagedLedgerOffloader
 - Failed readOffloaded: 
java.lang.NullPointerException: null
        at 
org.apache.bookkeeper.mledger.offload.jcloud.impl.BlobStoreManagedLedgerOffloader.lambda$new$1(BlobStoreManagedLedgerOffloader.java:153)
 ~[?:?]
        at 
org.apache.bookkeeper.mledger.offload.jcloud.impl.BlobStoreBackedReadHandleImpl.open(BlobStoreBackedReadHandleImpl.java:196)
 ~[?:?]
        at 
org.apache.bookkeeper.mledger.offload.jcloud.impl.BlobStoreManagedLedgerOffloader.lambda$readOffloaded$5(BlobStoreManagedLedgerOffloader.java:556)
 ~[?:?]
        at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
[?:1.8.0_252]
        at 
com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
 [com.google.guava-guava-25.1-jre.jar:?]
        at 
com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:57)
 [com.google.guava-guava-25.1-jre.jar:?]
        at 
com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)
 [com.google.guava-guava-25.1-jre.jar:?]
        at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
[?:1.8.0_252]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
[?:1.8.0_252]
        at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
 [?:1.8.0_252]
        at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
 [?:1.8.0_252]
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
[?:1.8.0_252]
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
[?:1.8.0_252]
        at 
io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
 [io.netty-netty-common-4.1.48.Final.jar:4.1.48.Final]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_252]
23:37:48.016 [bookkeeper-ml-workers-OrderedExecutor-1-0] ERROR 
org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl - 
[xxx/xxx/persistent/topic_name-partition-1] Error opening ledger for reading at 
position 25424:0 - org.apache.bookkeeper.mledger.ManagedLedgerException: 
Unknown exception
23:37:48.016 [bookkeeper-ml-workers-OrderedExecutor-1-0] WARN  
org.apache.bookkeeper.mledger.impl.OpReadEntry - 
[xxx/xxx/persistent/topic_name-partition-1][SubscriptionName] read failed from 
ledger at position:25424:0 : Unknown exception
23:37:48.016 [bookkeeper-ml-workers-OrderedExecutor-1-0] ERROR 
org.apache.pulsar.broker.service.persistent.PersistentDispatcherMultipleConsumers
 - [<persistent://xxx/xxx/persistent/topic_name-partition-1> / 
SubscriptionName] Error reading entries at 25424:0 : Unknown exception, Read 
Type Normal - Retrying to read in 55.893 seconds```
----
2020-09-11 23:51:24 UTC - Mehdi Laouichi: ```$&gt; pulsarctl bookkeeper ledger 
get 25424
 
{
  "25424": {
    "storeCtime": false,
    "hasPassword": false,
    "metadataFormatVersion": 3,
    "ensembleSize": 2,
    "writeQuorumSize": 2,
    "ackQuorumSize": 2,
    "length": 2875,
    "lastEntryId": 4,
    "ctime": 1599309659578,
    "cToken": 0,
    "state": "CLOSED",
    "digestType": "CRC32C",
    "allEnsembles": {
      "0": [
        {
          "port": 3181,
          "hostname": "xxx-3.us-west-2-bookie.pulsar.svc.cluster.local"
        },
        {
          "port": 3181,
          "hostname": "xxx-4.us-west-2-bookie.pulsar.svc.cluster.local"
        }
      ]
    },
    "currentEnsemble": null,
    "password": "",
    "customMetadata": {
      "application": "cHVsc2Fy",
      "component": "bWFuYWdlZC1sZWRnZXI=",
      "pulsar/managed-ledger": 
"c3BlbmRsYWJzLXNhcy1kZXYvcGd3L3BlcnNpc3RlbnQvdHJhbnNhY3Rpb25zX2NsZWFyZWQtcGFydGl0aW9uLTE="
    }
  }
}```
----
2020-09-11 23:51:59 UTC - Addison Higham: @Mehdi Laouichi what does 
`pulsar-admin topics stats-internal &lt;topic&gt;` give you?
----
2020-09-11 23:52:41 UTC - Addison Higham: that will tell you if pulsar thinks 
it is offloaded or not. Also, there is a thing called "offload deletion lag" 
which is a period where the ledger has been offloaded but not yet deleted from 
bookkeeper
----
2020-09-11 23:54:14 UTC - Mehdi Laouichi: 
<https://gist.github.com/sruon/9968781ccdd10077a8698c395d632967>
----
2020-09-12 00:09:20 UTC - Addison Higham: hrm... perhaps the format of 2.5.1 
doesn't tell if the ledgers were offloaded, there is an `offloaded: true/false` 
for each ledger version in what I have locally (2.6.1). The easiest way I can 
to think of getting that data then would be to look at zookeeper data
----
2020-09-12 00:13:11 UTC - Mehdi Laouichi: any idea how it would look in 
zookeeper? Been browsing the tree but not sure where to look for it
----
2020-09-12 00:13:38 UTC - Addison Higham: if you can get onto one of the 
zookeeper boxes, you can run `pulsar zookeeper-shell` and then in the 
`/managed-ledgers/&lt;tenant&gt;/&lt;namespace&gt;/persistent/&lt;topic&gt;` it 
should have a record there I do believe
----
2020-09-12 05:50:15 UTC - Sijie Guo: you need to run `curl -L 
localhost:8080/metrics`
----

Reply via email to