2020-05-28 12:03:11 UTC - Sébastien de Melo: Thanks, we can because we are not 
using functions; we'll try that.
----
2020-05-28 12:19:57 UTC - Laurent Chriqui: How about a procedure to recover a 
failed bookie? In case something else happens...
----
2020-05-28 12:22:20 UTC - Sébastien de Melo: @Sijie Guo
+1 : Laurent Chriqui
----
2020-05-28 17:24:08 UTC - Sijie Guo: @Laurent Chriqui - What does “a failed 
bookie” mean here?

If it is a bookie that is gone, bookie autorecovery will automatically 
re-replicate data to ensure ledgers meet the persistence requirements. You can 
also run `bin/bookkeeper shell recover` or `bin/bookkeeper shell decommission` 
to do so.
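For example, a rough sketch (the address below is just a placeholder for the failed bookie's advertised `address:port`):
```# Re-replicate the ledger fragments that lived on the failed bookie
bin/bookkeeper shell recover 192.168.1.10:3181```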
----
2020-05-28 18:15:41 UTC - Alexander Ursu: Hi, I'm trying out the WebSocket 
interface and noticed that the example code in the official docs is Python 2 
only. Is there a Python 3 example anywhere? I tried to convert it over, but I'm 
having issues with the encoding of bytes in JSON objects, which I think has 
changed since Python 2.
----
2020-05-28 18:33:20 UTC - Curtis Cook: I think you just need to convert to a 
string (if you actually have a dict) and encode it in UTF-8?
----
2020-05-28 18:33:36 UTC - Curtis Cook: the non-websocket code does this behind 
the scenes
----
2020-05-28 18:39:55 UTC - Alexander Ursu: I tried encoding the strings to 
utf-8, before they go through base64 encoding, but Python complains that a 
value in a JSON object cannot be of type bytes
----
2020-05-28 18:57:29 UTC - Sijie Guo: There are a couple of workarounds:

1. In the BookKeeper k8s service definition, you can set 
`publishNotReadyAddresses: true`. Since BookKeeper is a StatefulSet service, 
clients will always try to read the data from the pods that store it, so it is 
okay to publish not-ready addresses. This will solve the NPE issue you saw (see 
the sketch after this list).
2. In a related issue, brokers will take ~15 minutes to detect that a TCP 
connection has dropped after BookKeeper pods restart, because a) BookKeeper uses 
the hostname (the pod name in a k8s context) to maintain its connection pool, 
and b) the default TCP keepalive settings are high, so it usually takes more 
than 15 minutes to detect the dropped connection. In this case, try tuning the 
TCP keepalive settings in your broker pods.
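A minimal sketch of item 1, assuming the headless Service that backs the BookKeeper StatefulSet (names and ports here are illustrative, not taken from your manifests):
```apiVersion: v1
kind: Service
metadata:
  name: bookkeeper                 # illustrative; must match the StatefulSet's serviceName
spec:
  clusterIP: None                  # headless service backing the StatefulSet
  publishNotReadyAddresses: true   # publish pod DNS records even before readiness
  selector:
    app: bookkeeper
  ports:
    - name: bookie
      port: 3181```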
----
2020-05-28 18:58:42 UTC - Sijie Guo: Do you use sendAsync or send to produce 
messages?
----
2020-05-28 19:25:10 UTC - Sébastien de Melo: Hi @Sijie Guo, Laurent meant in 
case of this kind of problem. We have an autorecovery pod always running and 
also tried the recover subcommand, without success: it completed correctly, but 
the problem was still present. We did not try the decommission one, though.
Is it a bug, or have we done something wrong?
In addition, is it possible to launch an empty bookie and force a replication? 
If so, how? Is storage expansion needed for that to work?
----
2020-05-28 20:19:13 UTC - Adelina Brask: So, I can't seem to move on with 
Pulsar sources and sinks. I have a 3-node cluster with powerful machines, and I 
am using Logstash to send JSON through a Netty source and an Elasticsearch sink 
to Elasticsearch. I keep getting `"exceptionString" : "Elasticsearch exception 
[type=mapper_parsing_exception, reason=failed to parse field [the_real_deal] of 
type [text] in document with id 'R-rqXHIBp3FZZNCr9L8R'. Preview of field's 
value: '']",` for different fields in random order, but none of the fields are 
ever empty. I have checked by sending the same data from Logstash through Kafka 
to Elasticsearch. I have also checked by packing the whole event into one 
string field, "the_real_deal". I don't understand why Pulsar is not parsing the 
text. The text as found in the topic is: 
`{"the_real_deal":"{\"@version\":\"1\",\"host\":\"10.220.37.11\",\"timestamp\":\"2020-05-28T19:44:33,182Z\",\"node.id\":\"IA4phyHvRTaTh_dNrdX1QA\",\"message\":\"org.elasticsearch.xpack.security.audit
 action=\\\"indices:data/write/index:op_type/index\\\" 
event.action=\\\"access_granted\\\" event.type=\\\"transport\\\" 
indices=\\\"[\\\"default_sheplog_systems_elasticsearch-weekly.2020.w22\\\"]\\\" 
node.id=\\\"IA4phyHvRTaTh_dNrdX1QA\\\" 
origin.address=\\\"10.220.37.51:52538\\\" origin.type=\\\"rest\\\" 
request.id=\\\"ULF15RZhSwOD11Ie_GaIuA\\\" request.name=\\\"BulkItemRequest\\\" 
user.name=\\\"logstash_internal\\\" user.realm=\\\"native1\\\" 
user.roles=\\\"[\\\"logstash_internal\\\"]\\\"\",\"original_size\":949,\"type\":\"server\",\"level\":\"INFO\",\"cluster.uuid\":\"664ugVCVRzmPzVeJ3kcPLg\",\"cluster.name\":\"Elasticsearch\",\"node.name\":\"clstelastic01.dmz23.local\",\"date\":1.590695073201703E9,\"tags\":[\"default_sheplog_systems_elasticsearch\"],\"component\":\"o.e.x.s.a.l.LoggingAuditTrail\",\"uuid\":\"4b6a6e01-d27f-4572-b30c-c8c59e9813f8\",\"@timestamp\":\"2020-05-28T19:44:35.578Z\"}","tags":["default_sheplog_systems_elasticsearch"]}`
----
2020-05-28 21:01:49 UTC - Alexander Ursu: Hi, I'm using the Pulsar SQL plugin 
for Presto, but I'm noticing that sometimes queries don't receive the latest 
messages from a topic.

The topic in question has a very slow rate of ingestion, and we intend to keep 
infinite retention especially with tiered storage.

I notice that when the topic is queried for all messages, the latest don't 
appear, but they are actually available for consumption by other means.

Once newer messages come in, the ones from before finally become available. I'm 
not sure if this is an issue with the plugin or with Pulsar in general. This is 
on version 2.5.1.
----
2020-05-28 21:06:49 UTC - Alexander Ursu: Could this in any way be related to 
`pulsar.max-entry-read-batch-size` in the `pulsar.properties` config file for 
the catalog?
----
2020-05-29 00:15:16 UTC - Curtis Cook: so the payload is a base64 encoded string
----
2020-05-29 00:15:28 UTC - Curtis Cook: assuming you have some dict foo
----
2020-05-29 00:18:36 UTC - Curtis Cook: ```foo = {'key': 'value'}
msg = {
  "payload": base64.b64encode( json.dumps(foo).encode('utf-8') )
}```
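Putting it together, a minimal Python 3 sketch of a WebSocket producer, assuming the `websocket-client` package; the endpoint, topic, and payload below are just examples:
```import base64
import json

import websocket  # pip install websocket-client

# Example endpoint: adjust host, port, tenant/namespace, and topic to your setup.
url = 'ws://localhost:8080/ws/v2/producer/persistent/public/default/my-topic'
ws = websocket.create_connection(url)

foo = {'key': 'value'}
msg = {
    # b64encode returns bytes; decode to str so json.dumps can serialize it
    'payload': base64.b64encode(json.dumps(foo).encode('utf-8')).decode('ascii')
}
ws.send(json.dumps(msg))

print(json.loads(ws.recv()))  # broker acknowledgement, e.g. {"result": "ok", ...}
ws.close()```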
----
2020-05-29 06:42:12 UTC - Lawal Azeez: Hello house.
I currently use the Node.js client in my project, but to my surprise, any time 
I have more than 3 subscriptions, the Pulsar producer stops working. The code 
snippet below will never complete again for any producer.
```const producer = await this.client.createProducer({
      topic: 'my-topic'
    });```
It doesn't matter whether I subscribe to the same topic or not. Everything 
works fine until the subscriptions exceed 3. I tried the same process with the 
C# client and everything seems fine, so I am confused about what the issue 
might be.
#please help
----
2020-05-29 07:11:53 UTC - Sijie Guo: I don't think the Pulsar connector will 
parse the text. I think it just passes the text received by the Netty source 
straight to the Elasticsearch sink. Have you tried inspecting the data in the 
output topic of the Netty source?
----
2020-05-29 07:13:30 UTC - Sijie Guo: @Alexander Ursu this is related to Pulsar 
SQL bypassing brokers and reading the data from BookKeeper directly. The 
LastAddConfirmed (LAC) of an active ledger only advances with the next entry or 
the next explicit LAC commit. I think there is an open issue for this; it needs 
a flag enabled in the Pulsar broker plus fixes in the Presto connector.
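If I remember correctly (not verified here, so treat it as an assumption), the broker-side flag is the explicit-LAC interval in `broker.conf`, something like:
```# Assumption: interval (ms) at which the broker commits an explicit LAC to BookKeeper.
# The default of 0 disables it, so readers that bypass the broker (such as Pulsar SQL)
# cannot see the last entries of a slow, still-open ledger.
bookkeeperExplicitLacIntervalInMillis=1000```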
----
2020-05-29 07:13:30 UTC - Adelina Brask: Yes. And the data seems complete. The 
field is never empty. I can't understand why.
----
2020-05-29 07:14:04 UTC - Sijie Guo: Can you please show your code snippet?
----
2020-05-29 07:14:33 UTC - Sijie Guo: Did you have the original data without 
packing the event into one string?
----
2020-05-29 07:27:54 UTC - Lawal Azeez: @Sijie Guo thanks for trying to help
----
2020-05-29 08:28:33 UTC - Adelina Brask: 
`{"date":1590737041.195320,"type":"server","timestamp":"2020-05-29T07:24:01,003Z","level":"INFO","component":"o.e.x.s.a.l.LoggingAuditTrail","cluster.name":"Elasticsearch","node.name":"clstelastic01.dmz23.local","message":"org.elasticsearch.xpack.security.audit
 action=\"indices:data/write/index:op_type/index\" 
event.action=\"access_granted\" event.type=\"transport\" 
indices=\"[\"default_sheplog_systems_elasticsearch-weekly.2020.w22\"]\" 
node.id=\"IA4phyHvRTaTh_dNrdX1QA\" origin.address=\"10.220.37.51:44588\" 
origin.type=\"rest\" request.id=\"pE_t9rboQv2iw_KhnFRcrg\" 
request.name=\"BulkItemRequest\" user.name=\"logstash_internal\" 
user.realm=\"native1\" 
user.roles=\"[\"logstash_internal\"]\"","cluster.uuid":"664ugVCVRzmPzVeJ3kcPLg","node.id":"IA4phyHvRTaTh_dNrdX1QA"}`
----
2020-05-29 08:31:38 UTC - Adelina Brask: The error then names a random 
parameter (tag, @timestamp, node.id): `Preview of field's value: '']`. Those 
parameters are never empty, as they are stamped by our Logstash for each event.
----