2020-04-23 09:23:59 UTC - Sijie Guo: It seems that the maven asserts haven’t
been synced to maven central.
----
2020-04-23 09:24:04 UTC - Sijie Guo: It will take a while.
----
2020-04-23 09:54:17 UTC - tuteng:
<https://repository.apache.org/content/repositories/releases/org/apache/pulsar/>
----
2020-04-23 10:46:00 UTC - Fernando Miguélez: thx @tuteng
----
2020-04-23 11:36:08 UTC - Grzegorz Wojnarowski: @Grzegorz Wojnarowski has
joined the channel
----
2020-04-23 12:19:26 UTC - Grzegorz Wojnarowski: Hallo all. I have *Topic
Compaction* problem in Pulsar 2.5.1
----
2020-04-23 12:20:05 UTC - Grzegorz Wojnarowski: I expected that sending empty
message deletes message with given key. In my example there is only one message
per key but there are empty messages.
Is it my fault or misunderstanding?
Below is code fragment and program output.
----
2020-04-23 12:20:33 UTC - Grzegorz Wojnarowski: `topic_name =
'public/default/compacted'`
`for i in range(10):`
`producer.send("msg {}".format(str(i)).encode(),
partition_key=str(i).strip())`
`for i in range(1, 10, 2):`
`producer.send(b'', partition_key=str(i).strip())`
`run_compation(topic_name)`
`state = get_compaction_state(topic_name)`
`while state['status'] == "RUNNING":`
`time.sleep(1)`
`state = get_compaction_state(topic_name)`
`reader = pulsar_client.create_reader(topic_name,
pulsar.MessageId.earliest, is_read_compacted=True)`
`while reader.has_message_available():`
`msg = reader.read_next()`
`print(msg.message_id(), msg.partition_key(), msg.data().decode())`
`(52,130,-1,-1) 0 msg 0`
`(63,1,-1,-1) 2 msg 2`
`(63,3,-1,-1) 4 msg 4`
`(63,5,-1,-1) 6 msg 6`
`(63,7,-1,-1) 8 msg 8`
`(63,8,-1,-1) 9 msg 9`
`(63,9,-1,-1) 1`
`(63,10,-1,-1) 3`
`(63,11,-1,-1) 5`
`(63,12,-1,-1) 7`
`(63,13,-1,-1) 9`
----
2020-04-23 13:32:47 UTC - Christophe Bornet: Hi everyone, from the docs on the
binary protocol
> Message payloads are passed in raw format rather than protobuf format for
efficiency reasons.
Can someone give more details on this ? Where are the performance gains ?
+1 : Frank Kelly
----
2020-04-23 13:36:39 UTC - Huanli Meng: @Huanli Meng has joined the channel
----
2020-04-23 14:20:58 UTC - Manjunath Ghargi: Hi, I was looking for some
reference examples code of how Mutual Auth can be enabled from client and the
server-side? Can someone kindly help?
+1 : Frank Kelly
----
2020-04-23 14:55:04 UTC - Huanli Meng: @Huanli Meng set the channel topic: -
Pulsar 2.5.1 released!
<https://pulsar.apache.org/blog/2020/04/23/Apache-Pulsar-2-5-1/>
- 2020 Pulsar User Survey Report: <https://bit.ly/3d1KsGG>
- Pulsar Summit SF 2020: <https://pulsar-summit.org/>
+1 : matt_innerspace.io, Kirill Kosenko, rani, Rahul, Ben, Yuvaraj Loganathan,
Sijie Guo
tada : Sijie Guo
heart : Sijie Guo
----
2020-04-23 14:59:47 UTC - Huanli Meng: Hi everyone, Pulsar 2.5.1 is released.
For details, check out the release notes:
<https://pulsar.apache.org/release-notes/#251-mdash-2020-04-20-a-id251a>
heart : Frank Kelly, Yuvaraj Loganathan, Sijie Guo
----
2020-04-23 15:06:20 UTC - rani: *[Function Worker Rest API Calls]* Rest API
calls to the following endpoints return `Error 500 Internal Server Error`
```GET /admin/v3/functions/my_tenant_name/my_namespace_name/my_function_name
(Listing functions deployed in given namespace)
GET /admin/v3/functions/connectors (Listing supported pulso IO connectors)```
*Pulsar Component Version*: `2.5.1-candidate-3`
*Architecture*: Function workers running on k8s separately from remaining
components that are managed on EC2 via ASGs. Pulsar proxy was configured to be
aware of the separate function worker cluster by configuring
`functionWorkerWebServiceURL` appropriately.
My understanding is that these issues were fixed through this PR:
<https://github.com/apache/pulsar/pull/6486>. Am I missing something here?
----
2020-04-23 15:52:12 UTC - Gabriel Volpe: @Gabriel Volpe has joined the channel
----
2020-04-23 15:52:19 UTC - Kai Levy: Thanks. It looks like the python client
does not have that method available?
----
2020-04-23 15:53:41 UTC - Gabriel Volpe: Hi all! Quick question related to
<https://pulsar.apache.org/api/client/org/apache/pulsar/client/api/SubscriptionInitialPosition.html>
----
2020-04-23 15:53:58 UTC - Gabriel Volpe: The `Earliest` option says "the
earliest position which means the start consuming position will be the first
message"
----
2020-04-23 15:54:42 UTC - Gabriel Volpe: Does that mean the first message ever,
including those who have been acknowledged? Because that's the behavior I'm
seeing
----
2020-04-23 15:55:04 UTC - Rahul: Yes
----
2020-04-23 15:55:23 UTC - Gabriel Volpe: Thanks for your quick answer!
----
2020-04-23 16:07:19 UTC - Sijie Guo: I think there are some issues related to
compacting empty messages: <https://github.com/apache/pulsar/pull/6795>
It would come as part of 2.5..2. @Penghui Li can you double check?
----
2020-04-23 16:08:14 UTC - Sijie Guo: It avoids serialization and
deserialization. Memory copy kills performance.
----
2020-04-23 16:08:55 UTC - Sijie Guo:
<http://pulsar.apache.org/docs/en/security-tls-authentication/>
----
2020-04-23 16:09:04 UTC - Sijie Guo: This is the documentation for enabling
m-auth.
----
2020-04-23 16:10:00 UTC - Mohit Manaskant: @Mohit Manaskant has joined the
channel
----
2020-04-23 16:14:46 UTC - Sijie Guo: list functions in /v3 endpoint has a
problem - <http://pulsar.apache.org/docs/en/security-tls-authentication/>
but the error code is 404. 500 seems to indicate a different error. did you see
any errors in the proxy or in the broker?
----
2020-04-23 16:25:25 UTC - Mohit Manaskant: Hello, I had the following use case
which wish to try solving using Pulsar. We will be having `clicks` and
`impressions` data flowing through two different topics in Pulsar. Data is in
JSON, and both have `user` field in it and wish to join the data across two
topics. With my limited knowledge of Pulsar and experience using Pulsar, I
could not find a clear answer as to whether it can be achieved using :
1. Apache Pulsar SQL ==> Can i write a join query which will produce the
joined output to an output topic
2. Pulsar Functions ==> Can i model this problem using it? In the common use
cases of Functions like Filtering/Content based routing etc, we don't find any
help article where an example of stateful processing of joining across two
topics can be achieved.
Thanks in advance.
eyes : Konstantinos Papalias
----
2020-04-23 16:51:11 UTC - Christophe Bornet: You mean the ser/deser done by the
protobuf lib ? I think bytes type are encoded as-is by protobuf. So it should
be possible to access them with zero-copy, isn't it ?
----
2020-04-23 17:20:18 UTC - Sijie Guo: No it is not zero-copy in protobuf
----
2020-04-23 17:23:27 UTC - rani: these are the only errors i could find. Not
sure how helpful they are
```function-worker-pod 13:08:35.499 [function-web-21-5] INFO org
.eclipse.jetty.server.RequestLog - 10.242.17.7 - - [23/Apr/202 0:13:08:35
+0000] "GET /admin/v3/functions/my_tenant/my_namesoace/exclamation HTTP/1.1"
500 388 "-" "PostmanRuntime/7.24.0" 6
function-worker-pod 13:08:43.438 [function-web-21-5] ERROR
org.glassfish.jersey.message.internal.WriterInterceptorExecutor -
MessageBodyWriter not found for media type=application/xml,```
----
2020-04-23 17:40:09 UTC - Sijie Guo: @rani: It seems that “GET
/admin/v3/functions/my_tenant/my_namesoace/exclamation HTTP/1.1” 500 “.
Did you check the logs at function worker side?
----
2020-04-23 17:42:00 UTC - rani: hmm, these logs I shared are actually from the
function worker pod itself :disappointed: Not sure where else to look
----
2020-04-23 17:43:58 UTC - rani: On a slightly separate note @Sijie Guo, i’m
invoking the pulsar proxy REST interface with an admin token and and i’m
receiving the following on `/admin/v2/brokers/health`
```Message:
org.apache.pulsar.client.api.PulsarClientException$AuthorizationException:
Authorization failed pulsar-role-anonymous on topic
<persistent://pulsar/pulsar-staging/10.242.160.125:8080/healthcheck> with error
Don't have permission to administrate resources on this tenant```
*proxy.conf:*
`anonymousUserRole`=pulsar-role-anonymous
`authenticationProviders`=org.apache.pulsar.broker.authentication.AuthenticationProviderToken
`brokerClientAuthenticationParameters`=token:aaaaaa _(JWT_ corresponding to
_pulsar-role-proxy_)
*broker.conf*
`proxyRoles`=pulsar-role-proxy
`anonymousUserRole`=pulsar-role-anonymous
`authenticateOriginalAuthData`=false
1. Why am I being perceived as `pulsar-role-anonymous` even though I am using
authenticating with a JWT corresponding to a specific role (in this case
`pulsar-role-admin`)?
2. How can I get the the health check endpoint to work?
----
2020-04-23 18:58:50 UTC - Tolulope Awode: Hi guys
----
2020-04-23 18:59:30 UTC - Tolulope Awode: After a successful deployment of
pulsar on kubernetes cluster, deploying the node pulsar-client has been
failing. The Dockerfile composition failed as cpp needs to be installed and its
not available either as npm or package nugget
----
2020-04-23 18:59:35 UTC - Tolulope Awode: Please help
----
2020-04-23 19:39:48 UTC - Ryan Brereton: Hi everyone-- I’m wanting to
negatively acknowledge a message but in a Pulsar function. I’m wondering if
`context.getCurrentRecord().fail` (Pulsar Functions API) has the same
functionality/result as `consumer.negativeAcknowledge(message)` (Client API)
instead.
Basically does
<http://pulsar.apache.org/api/pulsar-functions/2.4.2/org/apache/pulsar/functions/api/Record.html#fail-->
do the same thing as
<http://pulsar.apache.org/api/client/2.4.2/org/apache/pulsar/client/api/Consumer.html#negativeAcknowledge-org.apache.pulsar.client.api.Message->
?
If so, what’s the redelivery delay when using .fail? I know the default when
n-acking in my consumer is 1 min and can be set via .negativeAckRedeliveryDelay.
----
2020-04-23 20:29:45 UTC - Sijie Guo: > these logs I shared are actually from
the function worker pod itself
How many function worker pods are you running? Did you check all of the them?
> Why am I being perceived as `pulsar-role-anonymous` even
It seems that you didn’t configure this properly. Hence the anonymous role is
used.
``` * If authentication and authorization is enabled(and not sasl) and if
the authRole is one of proxyRoles we want to enforce
* - the originalPrincipal is given while connecting
* - originalPrincipal is not blank
* - originalPrincipal is not a proxy principal ```
----
2020-04-23 21:14:30 UTC - Vladimir Shchur: It's the earliest ever only for a
new subscription, if consumer just reconnects, it will start from the last
acknowledged
----
2020-04-23 21:18:07 UTC - Sijie Guo: It is easier to do in Pulsar SQL. It is
available at this moment.
It is doable in Pulsar Functions as well. But you might have to write some fair
amount of code to achieve that.
----
2020-04-23 21:20:05 UTC - Sijie Guo: record.fail() is basically a wrapper of
`consumer.negativeAcknowledge()`
----
2020-04-23 21:20:42 UTC - Sijie Guo: So the delay is the consumer settings of
negativeAckRedeliveryDelay
----
2020-04-23 22:01:10 UTC - Ryan Brereton: Thank you. Seemed like it, but wanted
to make sure. Appreciate the help!
+1 : Sijie Guo
----
2020-04-23 22:04:52 UTC - Konstantinos Papalias: @Sijie Guo are there any
examples on Pulsar SQL joins?
----
2020-04-23 22:28:02 UTC - Sijie Guo:
<https://www.javahelps.com/2019/11/presto-sql-types-of-joins.html>
----
2020-04-23 22:28:20 UTC - Sijie Guo: You can find quite a lot of examples of
Presto SQL joins.
----
2020-04-24 01:10:24 UTC - Penghui Li: @Grzegorz Wojnarowski The problem is
fixed by <https://github.com/apache/pulsar/pull/6795> and I have add unit test
for the case that you described. You can try it on the master branch.
----
2020-04-24 01:17:15 UTC - Ebere Abanonu: @Sijie Guo if I had my own Security
token service server using IdentityServer4, how can I integrate that with
Pulsar for Authentication and Authorization?
----
2020-04-24 01:59:25 UTC - Sijie Guo: Typically you need to implement a broker
side oauth authentication provider that you can integrate with IdentityServer4.
For the client side, there are two approaches:
1. generate an access_token and use the standard token provider.
2. implement your own authentication provider that integrate with
IdentityServer4.
For authorization, you can use the default zookeeper-based authorization
provider. If you want use IdentityServer4 for authorization, you can implement
your own one as well.
----
2020-04-24 02:07:41 UTC - Ebere Abanonu: Can a demo be given? or any tutorial
on how to achieve this
----
2020-04-24 03:27:32 UTC - Mohit Manaskant: Hey @Sijie Guo, Does `Apache Pulsar
SQL` behaves like `KSQL` , which allows a registered query to run forever (as
KSQL internally turns into Kafka Streams application) and create a stream out
of it or it behaves like a traditional SQL engine, wherein the query executes
only once when fired and returns the data present, but it will not keep on
updating itself automatically.
Thanks in advance.
----
2020-04-24 04:04:40 UTC - Addison Higham: anyone have tips on debugging
bookkeeper heap and direct memory OOMs? have a bookkeeper with `-Xms512m -Xmx1g
-XX:MaxDirectMemorySize=1g` and (using k8s) set with a 2.5Gi limit.
`dbStorage_readAheadCacheMaxSizeMb="256", dbStorage_writeCacheMaxSizeMb="256",
journalMaxSizeMB="2048"` for other relevant memory settings, rest are default.
I was getting heap errors, resized it up to a bit bigger on heap and then got a
direct memory oom, finally worked once I went 2 gb heap and 2 gb direct mem
----
2020-04-24 04:18:54 UTC - magrain: @magrain has joined the channel
----
2020-04-24 05:06:14 UTC - Fayce: Another rookie question: The pulsar
documentation says that the Pulsar schema registry is currently available only
for the Java client. Does it mean that we cannot use schema (Avro or else) if
we have producers/consumers in C++ and Python? i.e we have to
serialize/de-serialize every message sent to/received from Pulsar???
----
2020-04-24 05:09:36 UTC - Sijie Guo: both c++/python support schema registry now
----
2020-04-24 05:09:45 UTC - Sijie Guo: cgo client also supports schema
----
2020-04-24 05:09:51 UTC - Fayce: Hi Sijeg, thanks for the feedback.
c++/python support schema in version 2.5.0?
----
2020-04-24 05:12:02 UTC - Sijie Guo: I think we added the support since 2.3 or
2.4.
----
2020-04-24 05:12:16 UTC - Sijie Guo: The documentation is probably not updated
yet.
----
2020-04-24 05:12:49 UTC - Fayce: ah ok thanks, I was probably reading an old
documentation then. Thanks a lot
----
2020-04-24 05:22:13 UTC - Fayce: I can see python example of how to create a
schema with python but not with c++. Is there some examples in c++ on how to
create a producer with a avro schema for instance?
----
2020-04-24 06:17:45 UTC - Grzegorz Wojnarowski: Thanks
----
2020-04-24 07:36:43 UTC - wuYin: hello guys, I’m a beginner with Pulsar
I want know, how pulsar store topic partition metadata, such as partition count?
I just find bundle owner in /namespace/public/default/0x40000000_0x80000000
but can’t found topic partition count
----
2020-04-24 07:40:49 UTC - wuYin: sorry for this stupid question
I just create partitioned-topic, and find it in
/admin/partitioned-topics/public/default/persistent/tp-1
+1 : Penghui Li
----
2020-04-24 07:40:56 UTC - Penghui Li: partitioned topic metadata is stored in
the Apache Zookeeper. You can try to use `$ bin/pulsar-admin topics
get-partitioned-topic-metadata` to get the metadata.
----
2020-04-24 07:42:37 UTC - wuYin: thanks for reply
----
2020-04-24 07:50:23 UTC - Gabriel Volpe: Mmm this is not the behavior I'm
seeing I think, I'll need to double-check
----
2020-04-24 08:10:00 UTC - Gabriel Volpe: Effectively, what I'm seeing is that
when I restart my application (using `Earliest`) the consumer is closed (I call
`unsubscribe` and `close`) and upon restarting it consumes the messages that
have already been acknowledged. Is this the expected behavior?
----
2020-04-24 08:12:08 UTC - Rahul: @Gabriel Volpe Are you using same subscription
while restarting your application
----
2020-04-24 08:12:45 UTC - Gabriel Volpe: Yes, the subscription is the same
----
2020-04-24 08:14:32 UTC - Ebere Abanonu: @Sijie Guo first attempt -
<https://github.com/eaba/PulsarSTSOAuthProvider/blob/master/src/main/java/org/apache/pulsar/broker/authentication/AuthenticationProviderIdentityServer4.java>
----
2020-04-24 08:14:47 UTC - Rahul: In that scenario it shouldn't happen. Please
verify that the subscription is still present for that topic before restating
your application.
----
2020-04-24 08:15:25 UTC - Gabriel Volpe: You mean the subscription still being
present in Pulsar?
----
2020-04-24 08:15:30 UTC - Ebere Abanonu: That seems to be the way to go to
support more than IdentityServer4 as Token Authentication Provider
----
2020-04-24 08:16:29 UTC - Rahul: Yeah
----
2020-04-24 08:17:10 UTC - Gabriel Volpe: Let me check
----
2020-04-24 08:20:04 UTC - Gabriel Volpe: So yes, I'm looking in Pulsar Express
and the subscription seems to still be around
`<persistent://public/default/auth-subscription>`
----
2020-04-24 08:20:30 UTC - Rahul: ```pulsar-admin topics subscriptions
<topic>```
----
2020-04-24 08:20:35 UTC - Gabriel Volpe: That means the `unsubscribe` command
is not really working right?
----
2020-04-24 08:21:39 UTC - Gabriel Volpe: Can I run that command from within my
Pulsar container?
----
2020-04-24 08:24:31 UTC - Gabriel Volpe: Okay got it
----
2020-04-24 08:24:46 UTC - Gabriel Volpe: ```$ pulsar-admin topics subscriptions
<persistent://public/default/auth>
"<persistent://public/default/auth-subscription>"```
----
2020-04-24 08:24:55 UTC - Gabriel Volpe: It's still around as Pulsar Express
shows
----
2020-04-24 08:32:34 UTC - Gabriel Volpe: Okay I think I found the issue. My
application stops, unsubscribes and closes the consumer but there's another
application still running which keeps this subscription active.
----
2020-04-24 08:32:38 UTC - Gabriel Volpe: Sorry for the trouble
----
2020-04-24 08:37:16 UTC - Christophe Bornet: Even with the modified Pulsar
protoc plugin ?
----
2020-04-24 08:38:15 UTC - Christophe Bornet: I'm trying to see what is possible
since with gRPC I can only use protobuf messages
----
2020-04-24 08:43:30 UTC - Sijie Guo: gRPC doesn’t support zero copy
----
2020-04-24 09:02:05 UTC - Fayce: Hello guys, is there anyone who has a sample
c++ producer with Avro schema? I looked everywhere in the documentation and c++
pulsar client but can't find anything on how to create a schema from c++ client
(only python). Unless I totally misunderstood the documentation and I have to
use directly the Avro c++ library to create my schema? Thanks a lot.
----
2020-04-24 09:06:59 UTC - Ali Ahmed: @Fayce you can get started with this
<https://github.com/apache/pulsar/blob/master/pulsar-client-cpp/tests/SchemaTest.cc>
----
2020-04-24 09:08:30 UTC - Fayce: Hi Ahmed, Thanks a lot. I will have a look at
that. Cheers
----
2020-04-24 09:09:59 UTC - Fayce: Is it already available in c++ client for
pulsar version 2.5.0?
----