2020-06-11 09:12:03 UTC - jujugrrr: Okay, maybe it's worth raising an issue on
GitHub? @Addison Higham is it something you had to cover for your talk on Pulsar
+ K8S, were you using EKS?
----
2020-06-11 10:40:14 UTC - jujugrrr: Hi all. I'm testing pulsar offloading to
S3. I have a script producing 1M messages and another one reading them. The
reading works well a few times (I re-run from scratch) but then I start to get
exceptions:
```10:28:56.701 [pulsar-io-24-1] INFO
org.apache.bookkeeper.mledger.impl.ManagedCursorImpl -
[ten/ns/persistent/my-topic-reader-3558c16521] Rewind from 233:0 to 233:0
10:28:56.701 [pulsar-io-24-1] INFO
org.apache.pulsar.broker.service.persistent.PersistentTopic -
[<persistent://ten/ns/my-topic>] There are no replicated subscriptions on the
topic
10:28:56.701 [pulsar-io-24-1] INFO
org.apache.pulsar.broker.service.persistent.PersistentTopic -
[<persistent://ten/ns/my-topic>][reader-3558c16521] Created new subscription
for 0
10:28:56.701 [pulsar-io-24-1] INFO org.apache.pulsar.broker.service.ServerCnx
- [/x.x.x.x7:53908] Created subscription on topic
<persistent://ten/ns/my-topic> / reader-3558c16521
10:28:56.705 [bookkeeper-ml-workers-OrderedExecutor-6-0] WARN
org.apache.bookkeeper.mledger.impl.OpReadEntry -
[ten/ns/persistent/my-topic][reader-3558c16521] read failed from ledger at
position:233:0 : Unknown exception
10:28:56.705 [broker-topic-workers-OrderedScheduler-3-0] ERROR
org.apache.pulsar.broker.service.persistent.PersistentDispatcherSingleActiveConsumer
- [<persistent://ten/ns/my-topic> /
reader-3558c16521-Consumer{subscription=PersistentSubscription{topic=<persistent://ten/ns/my-topic>,
name=reader-3558c16521}, consumerId=0, consumerName=,
address=/x.x.x.x7:53908}] Error reading entries at 233:0 : Unknown exception -
Retrying to read in 15.0 seconds```
Those ledgers are getting offloaded to S3. It looks like as soon as the ledger
is removed (set-offload-deletion-lag) from the local storage I'm getting the
exception above.
```10:28:28.452 [bookkeeper-ml-workers-OrderedExecutor-6-0] INFO
org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl -
[ten/ns/persistent/my-topic] End TrimConsumedLedgers. ledgers=3
totalSize=38942923
10:28:28.452 [bookkeeper-ml-workers-OrderedExecutor-6-0] INFO
org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl -
[ten/ns/persistent/my-topic] Deleting offloaded ledger 233 from bookkeeper -
size: 15432415
10:28:28.452 [bookkeeper-ml-workers-OrderedExecutor-6-0] INFO
org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl -
[ten/ns/persistent/my-topic] Deleting offloaded ledger 234 from bookkeeper -
size: 15504438
- size: 16168356```
Also I can see the Ledgers got removed from Zookeeper. Is there a configuration
option I'm missing? Is there a way to understand why `read failed from ledger
at position:233:0 : Unknown exception` is happening? Thank you!
I would expect to still be able to read the messages from the offloaded ledgers
----
2020-06-11 14:41:11 UTC - Arnaud Briche: Hi all,
I'm trying to deploy pulsar standalone on minikube, and it seems like I can
only produce/consume messages from the same container as pulsar.
Using the kubernetes service dns name does not work.
----
2020-06-11 15:22:33 UTC - Addison Higham: @Marcio Martins @jujugrrr S3
offloading is NOT built on top of the AWS SDK, it uses jclouds. OIDC requires a
new credential provider. jclouds will need to get support for OIDC. We started
in on Pulsar before OIDC was a thing, so we implemented per-pod IAM with kiam.
We also support OIDC but can't fully migrate to it until everything supports
OIDC auth
----
2020-06-11 15:24:54 UTC - jujugrrr: makes sense
----
2020-06-11 16:00:07 UTC - Addison Higham: oh actually, I just looked into this
more @jujugrrr all that needs to happen is we need to bump the aws-java-sdk
version as jclouds does use the credential provider chain from aws-sdk
----
2020-06-11 16:00:24 UTC - jujugrrr: ah, sounds good
----
2020-06-11 16:01:29 UTC - jujugrrr: definitely a great feature for all the
EKS-based deployments
----
2020-06-11 16:01:55 UTC - Marcio Martins: Yep, that's what I thought from that
github issue. Will it make it for 2.6.0?
----
2020-06-11 16:03:11 UTC - Matthew Follegot: Hi folks, I have a question
regarding using Pulsar as a message queue.
What happens if I have a small, finite number of consumers and a large
ingestion rate into a Pulsar topic? I understand that Pulsar persists
unacknowledged messages to BookKeeper, but what I don't understand is if these
events will automatically be fetched from BookKeeper at a later time or if they
will sit there indefinitely until they are manually consumed.
Any help is appreciated! Thank you :)
----
2020-06-11 16:03:17 UTC - Addison Higham: the sdk version was bumped already
for an unrelated feature
----
2020-06-11 16:03:28 UTC - Addison Higham: and is in 2.6.0
----
2020-06-11 16:04:54 UTC - Gary Fredericks: by "fetched from" do you mean
"deleted from"?
----
2020-06-11 16:06:06 UTC - jujugrrr: I think that's what you are looking for
<http://pulsar.apache.org/docs/en/cookbooks-retention-expiry/#backlog-quotas>
----
2020-06-11 16:07:17 UTC - Matthew Follegot: I mean consumed and deleted from
----
2020-06-11 16:07:39 UTC - Matthew Follegot: Thank you, I'll look into this!
----
2020-06-11 16:16:00 UTC - Marcio Martins: Top! Thx
----
2020-06-11 17:04:14 UTC - Nicolas Ha: I can now see the page with a v3 selector
:slightly_smiling_face: thanks
----
2020-06-11 20:55:34 UTC - Marcio Martins: Hey guys, I have pumped 26M messages
to pulsar with retention on, and offloaded almost everything to S3. Now I have
3 readers consuming, but every few seconds I get this exception:
```20:49:14.325 [bookkeeper-ml-workers-OrderedExecutor-0-0] ERROR
org.apache.bookkeeper.common.util.SafeRunnable - Unexpected throwable caught
java.lang.NullPointerException: null
at
org.apache.bookkeeper.mledger.impl.OpReadEntry.lambda$readEntriesFailed$0(OpReadEntry.java:88)
~[org.apache.pulsar-managed-ledger-2.5.1.jar:2.5.1]
at
org.apache.bookkeeper.mledger.util.SafeRun$1.safeRun(SafeRun.java:32)
~[org.apache.pulsar-managed-ledger-2.5.1.jar:2.5.1]
at
org.apache.bookkeeper.common.util.SafeRunnable.run(SafeRunnable.java:36)
[org.apache.bookkeeper-bookkeeper-common-4.10.0.jar:4.10.0]
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
[?:1.8.0_242]
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
[?:1.8.0_242]
at
io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
[io.netty-netty-common-4.1.45.Final.jar:4.1.45.Final]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_242]```
Does anybody know if my readers are still getting every message correctly, or
actually, what is the impact of this exception?
Secondly, the rate at which it's consuming is quite slow at ~150Mbit/s. At one
point the first reader which I started earlier than the other 2, was starved
for 1 minute with no messages at all. I assume this is due to fetching the
offloaded ledgers from S3, but even so, 1 minute with no messages seems like
something else is going on...
----
2020-06-11 21:43:23 UTC - Jeff Schneller: I created a tenant called `my-tenant`
and a namespace called `dev` using the pulsar-admin cli. When viewing in the
pulsar-admin cli I can see the tenant and namespace. I can also do
`pulsar-admin namespaces policies my-tenant/dev` and I get a result back. Now
when I use a java client to create a producer to a topic that doesn't exist (I
expect the topic to be created) using this code:
`client.newProducer(org.apache.pulsar.client.api.Schema.JSON(MyObject.class)).topic("persistent://my-tenant/dev/my-topic").create();`
I am getting an error of:
`org.apache.pulsar.client.api.PulsarClientException$BrokerMetadataException:
Policies not found for my-tenant/dev namespace`
Any ideas on what is going on? What did I miss in the namespace creation?
----
2020-06-11 21:44:53 UTC - Frank Kelly: Did you turn on
`authorizationEnabled=true` ?
----
2020-06-11 21:45:47 UTC - Jeff Schneller: yes it is turned on. Do I need to
set role access?
I thought if I didn't have any role access then all were allowed.
----
2020-06-11 21:47:13 UTC - Frank Kelly: No I _*think*_ if you turn on default
Authorization Plugin then basically you have to authorize access to each
tenant/namespace/topic - sorry I don't know more - still learning about Auth*n
myself. Possibly @Sijie Guo can be definitive?
----
2020-06-11 21:49:23 UTC - Jeff Schneller: Ok I can try that. I am still
learning myself. I was hoping to set the authorize access at the topic level
only so I don't give all roles produce/consume at the namespace and then forget
for a certain topic. I can certainly grant produce consume for all roles at
the namespace and then produce or consume at the topic level
----
2020-06-11 21:50:44 UTC - Sijie Guo: what did you get from `pulsar-admin namespaces
policies my-tenant/dev`?
----
2020-06-11 21:52:27 UTC - Jeff Schneller: `{`
`"auth_policies" : {`
`"namespace_auth" : { },`
`"destination_auth" : { },`
`"subscription_auth_roles" : { }`
`},`
`"replication_clusters" : [ "pulsar-cluster-1" ],`
`"bundles" : {`
`"boundaries" : [ "0x00000000", "0x40000000", "0x80000000", "0xc0000000",
"0xffffffff" ],`
`"numBundles" : 4`
`},`
`"backlog_quota_map" : {`
`"destination_storage" : {`
`"limit" : -1073741824,`
`"policy" : "producer_request_hold"`
`}`
`},`
`"clusterDispatchRate" : { },`
`"topicDispatchRate" : {`
`"pulsar-cluster-1" : {`
`"dispatchThrottlingRateInMsg" : 0,`
`"dispatchThrottlingRateInByte" : 0,`
`"relativeToPublishRate" : false,`
`"ratePeriodInSecond" : 1`
`}`
`},`
`"subscriptionDispatchRate" : {`
`"pulsar-cluster-1" : {`
`"dispatchThrottlingRateInMsg" : 0,`
`"dispatchThrottlingRateInByte" : 0,`
`"relativeToPublishRate" : false,`
`"ratePeriodInSecond" : 1`
`}`
`},`
`"replicatorDispatchRate" : { },`
`"clusterSubscribeRate" : {`
`"pulsar-cluster-1" : {`
`"subscribeThrottlingRatePerConsumer" : 0,`
`"ratePeriodInSecond" : 30`
`}`
`},`
`"publishMaxMessageRate" : { },`
`"latency_stats_sample_rate" : { },`
`"message_ttl_in_seconds" : 0,`
`"deleted" : false,`
`"encryption_required" : false,`
`"subscription_auth_mode" : "None",`
`"max_producers_per_topic" : 0,`
`"max_consumers_per_topic" : 0,`
`"max_consumers_per_subscription" : 0,`
`"compaction_threshold" : 0,`
`"offload_threshold" : -1,`
`"schema_auto_update_compatibility_strategy" : "Full",`
`"schema_compatibility_strategy" : "UNDEFINED",`
`"is_allow_auto_update_schema" : true,`
`"schema_validation_enforced" : false`
`}`
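In the output above, `namespace_auth` and `destination_auth` are both empty, so no roles have been granted at either the namespace or the topic level. A minimal Python sketch for checking that from saved `policies` output, assuming the JSON has been captured as a string; `granted_roles` is a hypothetical helper, not part of any Pulsar API:

```python
import json


def granted_roles(policies_json):
    """Return every role named in namespace_auth or destination_auth.

    Hypothetical helper: it only inspects the JSON text that
    `pulsar-admin namespaces policies` prints and never talks to a broker.
    """
    policies = json.loads(policies_json)
    auth = policies.get("auth_policies", {})
    # namespace_auth maps role -> actions granted on the whole namespace.
    roles = set(auth.get("namespace_auth", {}))
    # destination_auth maps topic -> {role -> actions} for per-topic grants.
    for topic_roles in auth.get("destination_auth", {}).values():
        roles.update(topic_roles)
    return roles


# Trimmed copy of the auth section shown above.
sample = """
{
  "auth_policies": {
    "namespace_auth": {},
    "destination_auth": {},
    "subscription_auth_roles": {}
  }
}
"""
print(sorted(granted_roles(sample)))  # prints []
```

An empty result here only says that no grants exist; whether the broker enforces authorization at all is a separate broker setting.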
----
2020-06-11 21:53:15 UTC - Jeff Schneller: I just tried to set the permissions
and received "Authorization is not enabled" so I guess I don't have
authorization turned on. It should have been.
----
2020-06-11 22:05:19 UTC - Jeff Schneller: Before I turn on authorization, I
would like to understand why I can't create a subscriber without authorization.
----
2020-06-11 23:01:05 UTC - Sijie Guo: Is this standalone or a cluster?
----
2020-06-11 23:03:43 UTC - Jeff Schneller: Cluster but only one broker is turned
on right now.
----
2020-06-12 01:17:28 UTC - Jeff Schneller: I am an idiot. I had a typo in the
tenant name. I really need to get some glasses.
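A typo like this can be caught before the client ever connects. A minimal sketch in Python (standard library only), assuming a hand-maintained set of known `tenant/namespace` strings; `parse_topic` and `namespace_is_known` are hypothetical helpers, not part of the Pulsar client:

```python
from typing import NamedTuple


class TopicName(NamedTuple):
    domain: str
    tenant: str
    namespace: str
    topic: str


def parse_topic(url):
    """Split 'persistent://tenant/namespace/topic' into its parts."""
    domain, sep, rest = url.partition("://")
    if not sep or domain not in ("persistent", "non-persistent"):
        raise ValueError(f"unexpected topic domain in {url!r}")
    # Split at most twice so any further slashes stay in the topic part.
    parts = rest.split("/", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected tenant/namespace/topic in {url!r}")
    return TopicName(domain, *parts)


def namespace_is_known(url, known_namespaces):
    """True if the topic's tenant/namespace is in the expected set."""
    name = parse_topic(url)
    return f"{name.tenant}/{name.namespace}" in known_namespaces


known = {"my-tenant/dev"}  # assumption: maintained by hand for this sketch
print(namespace_is_known("persistent://my-tenant/dev/my-topic", known))
print(namespace_is_known("persistent://my-tenat/dev/my-topic", known))
```

The first call reports the namespace as known; the second, with a misspelled tenant, does not, which corresponds to the case where the broker answers "Policies not found".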
----
2020-06-12 01:18:42 UTC - Luke Stephenson: Is there an image built from master
which has the fix?
----
2020-06-12 01:20:19 UTC - Luke Stephenson: Does anyone have S3 offloading
working with a region other than us-east-1? Would love to know how you got it
working
----
2020-06-12 01:21:28 UTC - Luke Stephenson: I'm blocked on this issue
<https://github.com/apache/pulsar/issues/3833>
----
2020-06-12 01:28:17 UTC - Penghui Li: @Luke Stephenson 2.6.0 rc is out and will
be released soon
----
2020-06-12 06:04:46 UTC - Sijie Guo: @Luke Stephenson - What is your
configuration?
----
2020-06-12 06:05:28 UTC - Sijie Guo: @xiaolong.ran - did verify S3 offloading
using 2.5.1 before.
----
2020-06-12 07:57:32 UTC - Daniel Ciocirlan: hey, question: if we have 4
geo-replicated standalone pulsar clusters in different regions, where one region
is producing and the other regions are consuming, do I have to create the
tenant/namespace/partitioned topic in every region, or just in the producing
region? Is the tenant/namespace/partitioned-topic information replicated to
other regions?
----
2020-06-12 08:39:13 UTC - Dhakshin: Hi
I am running standalone pulsar (on my laptop as a single node).
I want to see the Pulsar Functions metrics below in Prometheus. Which
property and file need to be modified?
*pulsar_function_last_invocation*
*pulsar_function_system_exceptions_total*
*pulsar_function_user_exceptions_total*
*pulsar_function_process_latency_ms*
*pulsar_function_received_total*
*pulsar_function_processed_successfully_total*
I referred to the link below [topics as Pulsar Functions]
<http://pulsar.apache.org/docs/en/next/reference-metrics/>
----