2020-07-27 10:29:32 UTC - Giacomo Porro: @Giacomo Porro has joined the channel ----
2020-07-27 11:35:11 UTC - Giacomo Porro: Hi everyone, first of all let me express my appreciation for this project, really really cool stuff! I don't know if this is the right place to ask, but here's my question: I am trying to use the BACKWARD_COMPATIBILITY schema check strategy, which forces me to update my consumers first, then my producers. I found the flow chart on this page on the Pulsar website <https://pulsar.apache.org/docs/en/schema-understand/>. Thing is: given an already existing schema on a certain topic which my consumer is subscribed to, when I try to update it by deploying the consumer with the new schema, Pulsar raises an exception like this one: "Exception: Pulsar error: IncompatibleSchema". I checked all my configurations and according to the flow chart the schema should be updated. What am I doing wrong? Thanks a lot folks! P.S. I am using Pulsar v2.6.0 with the Python client v2.6.0 as well ----
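For context, here is a minimal sketch of the consumer-first update flow being described, using the Python client's Avro `Record` schema support. The topic name, subscription name, field names, and the particular schema change (dropping a field, which BACKWARD compatibility generally allows) are all illustrative assumptions, not taken from the original message:

```python
import pulsar
from pulsar.schema import AvroSchema, Record, String

# Suppose the schema already registered on the topic was:
#   class Example(Record):
#       name = String()
#       city = String()
#
# Under BACKWARD compatibility the *new* schema must still be able to read data
# written with the old one: deleting a field is allowed, while adding a field
# without a default value is not and is rejected with IncompatibleSchema.
class Example(Record):
    name = String()

client = pulsar.Client('pulsar://localhost:6650')

# Deploying the consumer with the new schema triggers the compatibility check
# against the latest schema version registered for the topic.
consumer = client.subscribe(
    'persistent://public/default/example-topic',
    subscription_name='example-sub',
    schema=AvroSchema(Example),
)
```

It can also be worth confirming which strategy is actually applied at the namespace level (e.g. with `pulsar-admin namespaces get-schema-compatibility-strategy <tenant>/<namespace>`), since the check is enforced per namespace rather than per topic.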
2020-07-27 13:41:00 UTC - Roy Tarantino: @Roy Tarantino has joined the channel ----
2020-07-27 16:39:06 UTC - Addison Higham: @Walter how many ledgers are left? If you had ledgers that were only available on a single bookie, then the recovery process will never finish. You can use this command <https://bookkeeper.apache.org/docs/4.10.0/reference/cli/#bookkeeper-shell-listunderreplicated> to see the list of under-replicated ledgers, then you can run this command <https://bookkeeper.apache.org/docs/4.10.0/reference/cli/#bookkeeper-shell-ledgermetadata> on the remaining ledgers to see details of what the ensemble size was. If the ensemble size is 1, then you have likely lost the ledger and can delete it (using <https://bookkeeper.apache.org/docs/4.10.0/reference/cli/#bookkeeper-shell-deleteledger>) to get back to all ledgers being replicated ----
2020-07-27 16:46:28 UTC - alex kurtser: Hi everyone. May I know which ZooKeeper and BookKeeper versions Pulsar 2.6.0 is using? ----
2020-07-27 16:50:32 UTC - Varghese C: @Shivji Kumar Jha let's get this started please! :slightly_smiling_face: +1 : Shivji Kumar Jha ----
2020-07-27 16:54:11 UTC - Shivji Kumar Jha: I am really sorry for the delay, but I am happy to be reminded :slightly_smiling_face: Got busy securing our Pulsar cluster and this slipped my mind... ----
2020-07-27 16:55:12 UTC - Addison Higham: <https://github.com/apache/pulsar/blob/v2.6.0/pom.xml#L157> <- you can always find the current versions in the top-level pom file ----
2020-07-27 16:55:58 UTC - alex kurtser: :+1: ----
2020-07-27 16:56:02 UTC - alex kurtser: thanks ----
2020-07-27 16:56:36 UTC - Addison Higham: np :slightly_smiling_face: ----
2020-07-27 18:04:15 UTC - Varghese C: Thank you! ----
2020-07-27 18:07:05 UTC - Ryan: Interested in your thoughts: Has anyone considered decoupling message content (BLOB) data from Pulsar messages, storing the BLOB data in an external store/repository and simply storing URIs/pointers to the BLOB data in the Pulsar messages, then lazy-loading the BLOB data on demand when the client needs to retrieve it? My thought is to use BookKeeper for BLOB storage, as is done at <https://github.com/diennea/blobit>. This could eliminate the complicated existing Pulsar large-BLOB chunking strategies (e.g. subscription limitations, transaction support, etc.), reduce overall network usage from transmitting BLOB data unnecessarily and ensure Pulsar messages remain lightweight. ----
2020-07-27 18:34:55 UTC - Addison Higham: I think that is a fairly common pattern, called the "claim check" pattern; this doc <https://docs.microsoft.com/en-us/azure/architecture/patterns/claim-check> talks about it more. I implemented that same idea previously using S3 for the objects. As far as using BookKeeper, I think that makes sense but does have a trade-off in that your client must now be able to communicate directly with BookKeeper. For that reason, I think it might be a challenge to standardize that, as it won't work for all situations. What might be an interesting discussion is to see if it is a pattern common enough to include direct support in the client for offloading and "re-hydrating" large messages. There actually is support already for something that can do that, via "interceptors", see <http://pulsar.apache.org/api/client/org/apache/pulsar/client/api/ProducerInterceptor.html> and <http://pulsar.apache.org/api/client/org/apache/pulsar/client/api/ConsumerInterceptor.html> +1 : Ryan ----
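A minimal application-level sketch of the claim-check idea discussed above, using the Python client. The Python client does not expose interceptor hooks, so this wraps the send/receive calls directly; the in-memory `BlobStore`, the size threshold, the topic name and the `claim-check` property key are all hypothetical stand-ins for a real object store such as S3 or BlobIt:

```python
import uuid
import pulsar

# Hypothetical external blob store; in practice this would be S3, BlobIt, NFS, etc.
class BlobStore:
    def __init__(self):
        self._blobs = {}

    def put(self, key: str, payload: bytes) -> None:
        self._blobs[key] = payload

    def get(self, key: str) -> bytes:
        return self._blobs[key]

LARGE_THRESHOLD = 1 << 20  # 1 MiB; anything larger is offloaded to the blob store

blob_store = BlobStore()
client = pulsar.Client('pulsar://localhost:6650')
producer = client.create_producer('persistent://public/default/claim-check-demo')
consumer = client.subscribe('persistent://public/default/claim-check-demo', 'demo-sub')

def send_with_claim_check(payload: bytes) -> None:
    if len(payload) > LARGE_THRESHOLD:
        key = str(uuid.uuid4())
        blob_store.put(key, payload)
        # Send only the "claim check": a pointer to the externally stored blob.
        producer.send(key.encode('utf-8'), properties={'claim-check': 'true'})
    else:
        producer.send(payload)

def receive_with_rehydration() -> bytes:
    msg = consumer.receive()
    if msg.properties().get('claim-check') == 'true':
        # Lazy-load: the large payload is fetched only when actually needed.
        payload = blob_store.get(msg.data().decode('utf-8'))
    else:
        payload = msg.data()
    consumer.acknowledge(msg)
    return payload
```

Keeping only a pointer plus a message property in Pulsar keeps the topic lightweight; a "lazy-load" flag like the one described below would simply move the `blob_store.get()` call from receive time to first access.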
2020-07-27 18:45:22 UTC - Ryan: Yes, the claim check pattern, thank you. I didn't know about the interceptors, very interesting, definitely will have to dig further. My thought on supporting BookKeeper as a primary option (S3 makes sense too) is that Pulsar already uses BookKeeper, so there would not be any additional infrastructure to support (in non-cloud environments). If the interceptors work as you describe, then you could leverage claim-check capabilities transparently, via a "lazy-load" configuration flag. From the producer/consumer perspective, it should be transparent. Lazy vs. non-lazy load would be configurable per use case. ----
2020-07-27 20:29:25 UTC - Bre Gielissen: @Bre Gielissen has joined the channel ----
2020-07-27 22:25:32 UTC - Kalyn Coose: Hey all, what would be a typical use case for message chunking in Pulsar? ----
2020-07-28 00:39:55 UTC - Thomas O'Neill: @Thomas O'Neill has joined the channel ----
2020-07-28 02:54:27 UTC - Ryan: Messages larger than 5MB that you intend to send through Pulsar, because you do not have an alternative means of storage (e.g. S3, NFS, etc.) or your architecture/use cases do not support external storage access. ----
2020-07-28 04:44:34 UTC - Takahiro Hozumi: Hi, I have just updated a Pulsar cluster (5 nodes of ZK, bookie and broker) from 2.5.0, which seems to have a compaction problem ( <https://github.com/apache/pulsar/issues/6173> ), to 2.5.2. I have a topic with 300GB of retained data, which has many duplicated keys. After updating, a compaction of the topic started, and I noticed that the brokers became unstable, maybe due to the load of compaction. The brokers disappear and appear repeatedly in the results of `bin/pulsar-admin brokers list mycluster`. I think this instability is okay if it is a one-time problem, but I am concerned that compaction will always affect broker availability. It would be helpful if the compaction load could be limited to a predictable degree without manual operation. ----
2020-07-28 04:56:27 UTC - Sijie Guo: Did you see any other behaviors besides the brokers disappearing and appearing in the `bin/pulsar-admin brokers list` result? ----
2020-07-28 05:02:37 UTC - Takahiro Hozumi: The HTTP service on `brokers:8080` has been going down repeatedly. And here is the result of the top command on a node that runs the ZK, bookie and broker containers. ----
2020-07-28 05:39:14 UTC - Luke Stephenson: Thanks @xiaolong.ran. Looking forward to it ----
2020-07-28 05:44:56 UTC - Kadoi Takemaru: @Kadoi Takemaru has joined the channel ----
2020-07-28 07:06:36 UTC - Sijie Guo: Did the broker restart? ----
2020-07-28 07:11:25 UTC - Takahiro Hozumi: Yes, according to the docker status. ----
2020-07-28 08:13:28 UTC - Takahiro Hozumi: And I noticed that the `msgBacklog` of the `__compaction` subscription has not changed. Is compaction being processed? I'm thinking that the restarts might keep resetting the compaction progress over and over again. ----
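One way to answer the "is compaction being processed?" question is to poll the broker's admin interface. A small sketch, assuming the admin REST endpoints that back `pulsar-admin topics compaction-status` and `pulsar-admin topics stats`; the admin URL, topic name, and exact REST paths are assumptions and may differ by Pulsar version:

```python
import requests

# Illustrative values; point these at your broker's admin port and topic.
ADMIN_URL = 'http://localhost:8080'
TOPIC = 'persistent/public/default/my-topic'   # tenant/namespace/topic

# Compaction status of the topic (assumed REST path; this is roughly what
# `pulsar-admin topics compaction-status` queries).
status = requests.get(f'{ADMIN_URL}/admin/v2/{TOPIC}/compaction').json()
print(status)  # expected to report something like NOT_RUN / RUNNING / SUCCESS / ERROR

# Topic stats, to watch whether the __compaction cursor's backlog moves over time.
stats = requests.get(f'{ADMIN_URL}/admin/v2/{TOPIC}/stats').json()
print(stats.get('subscriptions', {}).get('__compaction', {}).get('msgBacklog'))
```

If the reported status stays in a running state while `msgBacklog` never moves across broker restarts, that would be consistent with the suspicion above that each restart resets the compaction progress.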
