2019-09-25 12:33:10 UTC - Greg Hoover: @Greg Hoover has joined the channel
----
2019-09-25 12:41:27 UTC - Bo Han: @Bo Han has joined the channel
----
2019-09-25 12:50:12 UTC - Bo Han: @Sijie Guo When running pulsar binary in 
standalone mode with --bookkeeper-dir changed to another folder, I still see 
pulsar creating data folder inside `/data/bookkeeper`.  This folder is used by 
`data.bookkeeper.ranges`, which can be configured in bookkeeper package 
downloaded from apache bookkeeper but not pulsar. Is there a way to fix it?
----
2019-09-25 13:42:13 UTC - Rajiv Abraham: Hi, I’m running debezium on pulsar to 
read from a mysql database instance as  specified in the excellent tutorial on 
debezium. In my Python consumer, on running `msg.value()` or `msg.data()`, I 
get a byte string which I can’t decode as UTF-8 or ascii. Can anyone help me 
figuring out how to decode it?
```
b' \x00\x00\x00\xa8 
{"schema":{"type":"struct","fields":[{"type":"int32","optional":false,"field":"id"}],"optional":false,"name":"dbserver1.inventory.customers.Key"},"payload":{"id":1001}}
 \x00\x00\x08\x00 
{"schema":{"type":"struct","fields":[{"type":"struct","fields" >>> 
ELIDED FOR CLARITY  
>>>>{"version":"0.9.2.Final","connector":"mysql","name":"dbserver1","server_id":0,"ts_sec":0,"gtid":null,"file":"mysql-bin.000003","pos":868,"row":0,"snapshot":true,"thread":null,"db":"inventory","table":"customers","query":null},"op":"c","ts_ms":1569418671780}}'
```
----
2019-09-25 15:15:51 UTC - tuteng: You can try deserializing with the following 
code: import pulsar
import json
import struct

client = pulsar.Client('<pulsar://127.0.0.1:6650>')
consumer = client.subscribe('dbserver1.inventory.products', 
subscription_name='sub')

while True:
        msg = consumer.receive()
        print("Received message: '%s'" % msg.data())
        data = msg.data()
        b = struct.unpack("&gt;l", data[0:4])
        print("Key length: " + str(b[0]))
        print("Key data: " + data[4: 4+b[0]])
        print("Json loads +++++++++++++")
        print(json.loads(data[4: 4 + b[0]]))
        c = struct.unpack("&gt;l", data[4 + b[0]: 4 + b[0] + 4])
        print("Value length: " + str(c[0]))
        value = data[4 + b[0] + 4: 4 + b[0] + 4 + c[0]]
        print("Value data: " + value)
        print("Json loads +++++++++++++")
        print(json.loads(value))

client.close()
----
2019-09-25 15:29:46 UTC - Rajiv Abraham: @tuteng Thank you kindly. Is this 
something I have to do for every CDC source? I’m wondering if it’ll be 4 bytes 
for every debezium data source? Do you know what that initial string is?
----
2019-09-25 15:34:07 UTC - Christophe Bornet: Hi, it seems Kafka will soon ditch 
Zookeeper in favor of a self-managed quorum 
(<https://cwiki.apache.org/confluence/display/KAFKA/KIP-500%3A+Replace+ZooKeeper+with+a+Self-Managed+Metadata+Quorum>).
 Is there any plan to do the same in Pulsar ? I understand it would probably be 
more work since it would need to be done in both BookKeeper and Pulsar but I 
think it would be very nice to have one less cluster to install and manage.
----
2019-09-25 15:47:39 UTC - Matteo Merli: Yes, we do have a plan for that, in 
multiple phases. Some work on that front was already started and then paused 
some time back for lack of time bandwidth, though we hope to get that effort 
back moving soon.

&gt; Kafka will soon ditch Zookeeper

My understanding is that it will take a while to get there
+1 : David Kjerrumgaard, Luke Lu, Christophe Bornet
heart_eyes : Poule, Christophe Bornet
----
2019-09-25 16:40:50 UTC - Ben Hood: Hi, Are there any examples of how to 
configure log4cxx with the C API? I’d like to configure the Pulsar client not 
to log so much to stdout, but it doesn’t appear to pick up a 
`log4cxx.properties` or `log4cxx.conf` from the `cwd`
----
2019-09-25 16:43:38 UTC - Matteo Merli: By default, the log4cxx is disabled at 
compile time. Did you compile by passing `-DUSE_LOG4CXX=1` to cmake?
----
2019-09-25 16:46:29 UTC - Ben Hood: No, I used the 
`apache-pulsar-client-devel.x86_64` package for Centos
----
2019-09-25 16:46:51 UTC - Ben Hood: Should I try to compile the client myself 
instead?
----
2019-09-25 16:48:00 UTC - Matteo Merli: Yes, we have disabled Log4Cxx by 
default since it brings in a lot of dependencies
----
2019-09-25 16:49:48 UTC - Matteo Merli: You can use the Docker image 
`apachepulsar/pulsar-build:centos-7` which comes with all the dependencies
----
2019-09-25 16:50:07 UTC - Matteo Merli: It’s the same image that’s used build 
the Rpms
----
2019-09-25 16:57:06 UTC - Ben Hood: OK, I’ll try that
----
2019-09-25 16:57:42 UTC - Addison Higham: @Rajiv Abraham this is how the 
debezium source (currently) encodes the key and the value, that code lives 
here: 
<https://github.com/apache/pulsar/blob/master/pulsar-client-api/src/main/java/org/apache/pulsar/common/schema/KeyValue.java#L97>
----
2019-09-25 16:57:45 UTC - Addison Higham: if you want the details
----
2019-09-25 16:58:18 UTC - Ben Hood: BTW is there any other way to trim what the 
client logs to stdout other than linking against Log4Cxx?
----
2019-09-25 16:58:32 UTC - Addison Higham: alternate encodings are possible with 
pulsar and of that KeyValue class, but currently there isn't a way to change 
that out of the box
----
2019-09-25 17:02:25 UTC - Matteo Merli: I was checking since I didn’t 
remember.. :slightly_smiling_face:
----
2019-09-25 17:02:26 UTC - Rajiv Abraham: @Addison Higham Thank you so much for 
digging that source code. I see that they add four bytes, so it’s debezium 
specific and I can use @tuteng code above for debezium. I’m just curious why 
they add the bytes. I guess I’ll have to ask them. Thanks again
----
2019-09-25 17:02:55 UTC - Matteo Merli: the simple logger is hardcoded to INFO 
level
----
2019-09-25 17:03:11 UTC - Matteo Merli: though it’s possible to pass in a 
callback and do your own logging
----
2019-09-25 17:03:46 UTC - Matteo Merli: See `ClientConfiguration::setLogger()`
----
2019-09-25 17:05:03 UTC - Matteo Merli: …or 
`pulsar_client_configuration_set_logger()` if you use C
----
2019-09-25 18:07:59 UTC - Addison Higham: the bytes are the length of the key 
and the value
----
2019-09-25 18:08:49 UTC - Rajiv Abraham: oh ok, thanks! appreciate it.
----
2019-09-25 18:19:41 UTC - Matteo Merli: @Jesse Zhang (Bose) As an update, I was 
able to reproduce locally and get (in some way) the topic in same state. 
Working on a fix
----
2019-09-25 19:02:37 UTC - Jesse Zhang (Bose): great!
----
2019-09-25 19:27:43 UTC - Jesse Zhang (Bose): I can also reproduce this using 
python client
Producer P produces messsages (60-69) slowly every 2s
consumer(I use python this time) C consume them, but not ack, with ack timeout 
setting to 10s

consumer code:
```
consumer = client.subscribe(consumer_name= "jesse-consumer", topic=[
      '<persistent://xxxx/xxxx-core/notify.account-removed>'
       ], subscription_name= 'xxx-xxxx-local', consumer_type= 
ConsumerType.Shared, unacked_messages_timeout_ms=10000)


while True:
    msg = consumer.receive()
    print("Received message id='{}'".format(msg.message_id()))
    #consumer.acknowledge(msg)
    
client.close()

```

this is the log in client:


Received message id=‘(10,60,-1,-1)’
Received message id=‘(10,61,-1,-1)’
Received message id=‘(10,62,-1,-1)’
Received message id=‘(10,63,-1,-1)’
Received message id=‘(10,64,-1,-1)’
Received message id=‘(10,65,-1,-1)’
Received message id=‘(10,66,-1,-1)’
Received message id=‘(10,67,-1,-1)’
Received message id=‘(10,68,-1,-1)’
2019-09-25 15:10:47.037 INFO  UnAckedMessageTrackerEnabled:48 | [Muti Topics 
Consumer: TopicName - 
<persistent://xxxx/xxxx-core/notify.account-removed-TopicsConsumerFakeName-c19975>
 - Subscription - svc-xxxx-xxx-sub-local]: 4 Messages were not acked within 
10000 time
`//Jesse's comment:  from /stats, "unackedMessages": 10`
Received message id=‘(10,60,-1,-1)’
Received message id=‘(10,61,-1,-1)’
Received message id=‘(10,62,-1,-1)’
Received message id=‘(10,63,-1,-1)’
Received message id=‘(10,64,-1,-1)’
Received message id=‘(10,65,-1,-1)’
Received message id=‘(10,66,-1,-1)’
Received message id=‘(10,67,-1,-1)’
Received message id=‘(10,68,-1,-1)’
Received message id=‘(10,69,-1,-1)’
2019-09-25 15:11:07.044 INFO  UnAckedMessageTrackerEnabled:48 | [Muti Topics 
Consumer: TopicName - 
<persistent://xxxxx/xxx-core/notify.account-removed-TopicsConsumerFakeName-c19975>
 - Subscription - svc-xxx-xxx-sub-local]: 10 Messages were not acked within 
10000 time
`//Jesse's comment:  from /stats, "unackedMessages": 1`
Received message id=‘(10,69,-1,-1)’
2019-09-25 15:11:27.052 INFO  UnAckedMessageTrackerEnabled:48 | [Muti Topics 
Consumer: TopicName - 
<persistent://xxxx/xxx-core/notify.account-removed-TopicsConsumerFakeName-c19975>
 - Subscription - svc-xxx-xxx-sub-local]: 1 Messages were not acked within 
10000 time
`//Jesse's comment:  from /stats, "unackedMessages": 0`
```
----
2019-09-25 19:29:21 UTC - Matteo Merli: Yes, I’m using Python client as well
----
2019-09-25 19:30:00 UTC - Matteo Merli: though this only happens under certain 
conditions, still narrowing down exactly which ones
----
2019-09-25 19:40:28 UTC - Jesse Zhang (Bose): please guide us on any 
workaround. I can upgrade to new client ( we use go client) , but I am not able 
to patch our server which is currently on 2.3.1.
----
2019-09-25 20:46:09 UTC - Luke Lu: BTW, we’re getting internal 500 errors doing 
`pulsar-admin topics unload` on _some_ topics. Haven’t you guys seen this 
before?
----
2019-09-25 21:01:26 UTC - Ben Hood: ah, OK, I hadn’t seen that callback API
----
2019-09-25 21:02:31 UTC - Ben Hood: So that would be an alternative to pulling 
in Log4Cxx with all of its dependencies
----
2019-09-25 21:25:50 UTC - Matteo Merli: Yes, you can print and filter the 
messages as you like from the custom logger
----
2019-09-25 21:27:49 UTC - Matteo Merli: @Jesse Zhang (Bose) Posted a fix: 
<https://github.com/apache/pulsar/pull/5276>

This will get released with 2.4.2, which we expect soon.

The issue itself is on server side, so a client upgrade won’t fix that. As a 
general suggestion, you could try to use `negativeAcks` when the processing of 
a message fails, instead of relying on the ack timeout, that might work around 
the issue as well.
----
2019-09-25 21:37:30 UTC - Matteo Merli: @Luke Lu a 500 error should have a 
corresponding exception with stack trace in the broker. Can you get that and 
open an issue?
----
2019-09-25 22:53:17 UTC - tuteng: KeyValue schema appears largely because it is 
compatible with kafka, and debezium is deeply dependent on kafka. Currently, 
python clients do not support KeyValue schema. If KeyValue schema is supported 
in the future, it will be more convenient to do this.
----
2019-09-25 22:55:58 UTC - Ben Hood: Turns out that it is quite simple to 
control the logging with your own function pointer:
----
2019-09-25 22:56:18 UTC - Ben Hood: ```void log_callback(pulsar_logger_level_t 
level, const char *file, int line, const char *message, void *ctx)
{
    printf("Log callback %d %s:%d %s\n", (int)level, file, line, message);
}```
----
2019-09-25 22:58:55 UTC - Ben Hood: And then register this with the client 
object:
----
2019-09-25 22:59:01 UTC - Ben Hood: 
```pulsar_client_configuration_set_logger(conf, &amp;log_callback, NULL);```
----
2019-09-25 23:00:39 UTC - Ben Hood: I’m assuming the `void *ctx` is there to 
supply some kind of app specific kind context, if required by the callback
----
2019-09-25 23:02:24 UTC - Matteo Merli: correct, it will be passed back each 
time when the callback is invoked, it’s completely app specific
----
2019-09-25 23:09:43 UTC - Ben Hood: Makes complete sense - feels like this API 
design allows flexibility in either direction - either with Log4cxx or your own 
ditty
----
2019-09-25 23:10:03 UTC - Ben Hood: Many thanks for pointing this out to me, 
very much appreciated
----
2019-09-25 23:21:38 UTC - Matteo Merli: :+1:
----
2019-09-25 23:24:30 UTC - Jesse Zhang (Bose): @Matteo Merli thanks for the 
update.   From release notes, `negativeAcks` is introduced in 2.4.x,  is it 
actually available in 2.3.1 which we are running?
----
2019-09-25 23:45:18 UTC - Matteo Merli: You should be able to use it with a 2.4 
client
----
2019-09-26 00:01:44 UTC - Karthik Ramasamy: Slides of serverless tutorial in 
the context of Apache Pulsar and Pulsar functions are available at
----
2019-09-26 00:01:45 UTC - Karthik Ramasamy: 
<https://www.slideshare.net/arunkejariwal/serverless-streaming-architectures-and-algorithms-for-the-enterprise-175954094>
----
2019-09-26 00:02:08 UTC - Karthik Ramasamy: This was presented at Strata Data 
Conference, New York
heart_eyes : Poule, Ali Ahmed
----
2019-09-26 01:13:38 UTC - zhchxiao: @zhchxiao has joined the channel
----
2019-09-26 08:47:48 UTC - jia zhai: FYI. Slices for today’s talk in Strata NY 
is here:
<https://www.slideshare.net/streamnative/how-orange-financial-combat-financial-frauds-over-50m-transactions-a-day-using-apache-pulsar-176284080>
----

Reply via email to