2020-05-21 10:05:18 UTC - fenghao007: @fenghao007 has joined the channel ---- 2020-05-21 11:51:47 UTC - Ermir Zaimi: Hi, I configured Pulsar standalone with JWT according to the documentation, but we get HTTP 401 Unauthorized on starting the standalone Pulsar service. Any suggestions? ---- 2020-05-21 12:20:53 UTC - Raman Gupta: @Patrik Kleindl I don't believe it's about treating Slack as a "free support" channel. In a healthy Slack community, the community owner(s) get as much out of the interactions as users do, which is also why I don't believe scaling up here for Pulsar is that big a challenge. As one counter-example, the Kotlin Slack community is far, far larger than the Confluent Slack community, and it generally works amazingly well. Don't forget, it's Confluent that is raising "size of community" as a comparison point in their favor. My point is that "health of community" != "size of community". ---- 2020-05-21 12:51:45 UTC - Patrik Kleindl: @Raman Gupta As I said, I am not a Confluent employee, nor do I agree with everything they say or do. I just do not agree with your statement about the Kafka community. The scalability issue I see is that currently the most senior Pulsar developers have time to do community work, which is great, but that becomes limited as soon as they have to do paid work elsewhere or the volume simply exceeds their capacity. ---- 2020-05-21 12:55:34 UTC - Raman Gupta: And like I said, community is not just about the unwashed masses begging for help from a few senior devs. Let's agree to disagree. ---- 2020-05-21 13:19:08 UTC - Hiroyuki Yamada: Hi, I’ve asked a question before about backup in Pulsar and learned there was no backup solution except for (Auto) Recovery. I really feel we need some snapshot of bookie data as discussed <https://github.com/apache/pulsar/issues/4942|here>, and I’m wondering whether backing up closed ledgers (for example, with `rsync`) could be a feasible solution. Does anyone know about this? I’m also wondering how people run and operate Pulsar in production without backup. What if a disk of one of the nodes is broken and needs to be replaced? It would be great if anyone can help me. Thanks. ---- 2020-05-21 14:24:58 UTC - Deepa: Hi @Sijie Guo,
As mentioned above, even without changing any properties on keepAlive (on both the broker and client side), connections get closed after 60 seconds and a new connection is established automatically and used for further produce/consume operations. Is there an option to keep the connections alive for a given time period, as in JMS? (I see this happening only when I pause the attached program in debug mode.) Attaching the program used and the log here ---- 2020-05-21 16:03:20 UTC - Sijie Guo: If you pause the program, the JVM will stop, which means the client will stop sending keep-alive messages. ---- 2020-05-21 16:04:58 UTC - Sijie Guo: Did you see a problem if you didn’t pause the program? ---- 2020-05-21 16:48:36 UTC - David Kjerrumgaard: I have a working standalone Docker image here with JWT security enabled that you can compare to your configuration. <https://github.com/david-streamlio/pulsar-in-action/tree/master/docker-images/pulsar-standalone> ---- 2020-05-21 16:49:53 UTC - David Kjerrumgaard: Data is automatically replicated inside the BookKeeper layer itself. Therefore, you have multiple copies of the same data available even in the event of disk or even bookie failure. ---- 2020-05-21 16:51:08 UTC - David Kjerrumgaard: <https://pulsar.apache.org/docs/en/concepts-architecture-overview/#ledgers>. "A ledger is an append-only data structure with a single writer that is assigned to multiple BookKeeper storage nodes, or bookies. Ledger entries are replicated to multiple bookies." ---- 2020-05-21 16:52:33 UTC - Ermir Zaimi: Thanks, I will look at it ---- 2020-05-21 17:41:02 UTC - Matt Mitchell: I’m investigating a way to implement request/reply using Pulsar, where a producer sends a request and consumers are subscribed via an exclusive subscription (only 1 consumer needs to “reply”). Right now, a consumer replies to a “replies” topic and subscribers of that topic do so via a Shared subscription (every reply consumer receives the reply). What I’d like to do instead is change the reply behavior so that the server/JVM that sent the request is the only reply consumer that receives the reply. Is that possible? Are there any examples of the request/reply pattern implemented using Pulsar? ---- 2020-05-21 17:52:53 UTC - Addison Higham: two ideas: 1. if you don't have that many unique hosts/processes/etc, it may not be that insane to have a reply topic per host/process. Topics in Pulsar are *fairly* cheap; having a few thousand really isn't too big of a deal. Topics can also be transient: if there aren't active subscriptions/producers and no retained messages, topics can be configured to be automatically deleted. AFAIK, this is pretty much how this works in RabbitMQ with transient reply topics. If you have potentially tens of thousands of topics this may get scary 2. there was talk recently (not finding it right away) of implementing a server-side filter of keys for subscriptions. That would allow you to have a single reply *topic* but an exclusive subscription filtered just to a given key. I think that would be ideal, but once again, not yet implemented. ---- 2020-05-21 17:56:13 UTC - David Kjerrumgaard: You could have the producers each subscribe to their own second "control" topic, e.g. `<persistent://my-tenant/my-ns/producer-1-control-topic>` Then you have each producer embed the name of their control topic inside of the message properties that they send. 
The consumer can read the properties to get the control topic name and publish a response directly to that topic (which only that particular producer is subscribed to) ---- 2020-05-21 17:57:25 UTC - David Kjerrumgaard: That is more of the "return address" pattern, but I think it would meet your requirements (if I understand them correctly) ---- 2020-05-21 18:27:54 UTC - Matt Mitchell: Ok, that sounds very straightforward (each requesting server subscribing to its own dedicated topic). This system will never have more than 10-20 requesting servers, so it should work fine. Time to give it a try. Thank you both! ---- 2020-05-21 18:29:53 UTC - Matt Mitchell: Actually, one question… how do I configure topics to be transient? ---- 2020-05-21 19:06:57 UTC - Franck Schmidlin: This blog from @Kirill Merkushev has proven super useful in writing automated integration tests around my first Pulsar function. Thank you! <https://lanwen.ru/posts/pulsar-functions-how-to-debug-with-testcontainers/|https://lanwen.ru/posts/pulsar-functions-how-to-debug-with-testcontainers/> slightly_smiling_face : Patrik Kleindl, Kirill Merkushev, David Kjerrumgaard clap : David Kjerrumgaard, Karthik Ramasamy ok_hand : Konstantinos Papalias ---- 2020-05-21 20:05:37 UTC - Kirill Merkushev: Glad to be helpful! :) ---- 2020-05-21 20:44:14 UTC - David Kjerrumgaard: @Matt Mitchell what do you mean by transient? ---- 2020-05-21 21:04:17 UTC - Matt Mitchell: Based on what @Addison Higham mentioned here: <https://apache-pulsar.slack.com/archives/C5Z4T36F7/p1590083573316300> - curious to know how topics can be auto-deleted if there are no active consumers/producers. ---- 2020-05-21 21:07:24 UTC - Jeff Schneller: Completely new to Pulsar myself, but could the producer put the topic that the consumer should respond to in the message? The topic name could be a GUID so it is unique. Then the consumer does what it needs to do and replies to the topic that the producer said to reply on. You would need to do some topic cleanup or auto-delete if there are no messages over a certain period of time (I think that is a possibility). ---- 2020-05-21 21:12:00 UTC - David Kjerrumgaard: Ah, so configuring the topics to get deleted if no subscriptions are available? ---- 2020-05-21 21:14:06 UTC - David Kjerrumgaard: The default behavior is to delete topics without any data or active subscriptions, and is controlled by the `brokerDeleteInactiveTopicsEnabled` property in the broker.conf file (it defaults to `true`). ---- 2020-05-21 21:14:22 UTC - Patrik Kleindl: @David Kjerrumgaard Thanks for the <http://lenses.io|lenses.io> article, but from what I understand the problems described would not change with Pulsar. A complex processing topology is a challenge, and Pulsar Functions are not comparable to Kafka Streams. ---- 2020-05-21 21:16:41 UTC - David Kjerrumgaard: Sure, I just intended to demonstrate that there are several "horror" stories with Kafka out there that can be used to counter the "Kafka is easy" narrative that Confluent is spreading. Having a long history is a double-edged sword. It makes your product more "mature" but also exposes more weaknesses over time ---- 2020-05-21 21:17:23 UTC - David Kjerrumgaard: "It got to the point where the CEO would be asking whether it was a Kafka issue every time there was a problem with the data flow. “In 99% of the cases, the answer was yes,” Schipka said." 
my favorite quote ---- 2020-05-21 21:19:43 UTC - Raman Gupta: However, they then went on to say that Lenses helped them figure out they were doing it wrong in the first place, so I'm not sure that in particular is the best example. That being said, it's pretty much the same story at our startup with Kafka. +1 : David Kjerrumgaard ---- 2020-05-21 21:19:55 UTC - David Kjerrumgaard: While <http://lenses.io|lenses.io> solved this particular case, there are dozens of R/T pipelines with similar struggles due to Kafka. Their solution was to go with a managed service offering, because Kafka was too complex to manage themselves, and to use K8s. ---- 2020-05-21 21:21:25 UTC - David Kjerrumgaard: Definitely not the best generic example, but one that a lot of people can relate to on some level when working with Kafka, which has reliability and scalability issues. +1 : Raman Gupta ---- 2020-05-21 21:24:24 UTC - Raman Gupta: They talked about seeing how complex Kafka Streams made their topology through Lenses. It was a bit naive on their part. Didn't they see the tens of intermediate topics Kafka creates for any mildly complex stream? It's kind of crazy. ---- 2020-05-21 21:25:07 UTC - Raman Gupta: This tool is great for seeing how ridiculous things get: <https://zz85.github.io/kafka-streams-viz/> grinning : David Kjerrumgaard ---- 2020-05-21 21:28:56 UTC - Patrik Kleindl: We had a very similar situation, but our problem was not with Kafka as a platform, which was really stable, but with the complexity of stream processing, which I doubt is much better in other tools. And yes, what Raman just mentioned helps to visualize things. Better visibility than that is usually the domain of commercial products. ---- 2020-05-21 21:30:31 UTC - Raman Gupta: I've had tonnes of issues with Kafka as a platform. I believe I've reported close to 10 issues to the Kafka project in the last year. ---- 2020-05-21 21:31:11 UTC - Tanner Nilsson: I'm trying to create a Python function using the REST API, but I can't figure out how to do it and send a local `.py` file the way you can with pulsar-admin.... ---- 2020-05-21 21:31:14 UTC - Tanner Nilsson: With pulsar-admin I would do ```bin/pulsar-admin functions create \ --tenant <tenant> \ --namespace <namespace> \ --name <function_name> \ --py <path_to_py_or_zip> \ --className <className> --inputs <inputs> \ --output <output>``` With the REST API, I've done it with ```curl -X POST \ -H "Authorization: Bearer <token>" \ -F functionConfig='{ "tenant":"<tenant>", "namespace":"<namespace>", "className":"<className>", "runtime":"PYTHON", "inputs":"<inputs>", "output":"<output>"};type=application/json' \ -F url='http://<url_to_file>;type=application/text' \ http://<pulsar_host>/admin/v3/functions/<tenant>/<namespace>/<function_name>``` but that only works if you can provide a URL where the file can be downloaded. Can the REST API be used for a local file (local to where the POST is originating, not on the broker/function-worker)? ---- 2020-05-21 21:31:54 UTC - Raman Gupta: They've fixed quite a few of them, and any complex system has bugs, but still, dealing with its quirks and odd behaviors, even short of bugs, is not easy. ---- 2020-05-21 21:32:02 UTC - Patrik Kleindl: We had questions like the one from the CEO above, and of course the first suspect was usually Kafka. It turned out more often to be misconfigured OSes, triple-mirrored storage systems, and lots of other things. ---- 2020-05-21 21:33:01 UTC - Raman Gupta: Isn't that the point though @Patrik Kleindl? 
Any system that requires that level of dedication to its infrastructure can't claim to be easy to manage. ---- 2020-05-21 21:33:44 UTC - Patrik Kleindl: I have reported issues and helped fix some of them too. It's still a community project :wink: ---- 2020-05-21 21:34:10 UTC - Raman Gupta: I think I remember seeing a comment from you on one of my issues, IIRC :slightly_smiling_face: ---- 2020-05-21 21:36:28 UTC - Raman Gupta: IMO, a tool like Pulsar/Kafka should strive to fail fast or demonstrate poor performance in the face of infrastructure issues or misconfigurations. Kafka unfortunately more often than not just blows up in weird and crazy ways. ---- 2020-05-21 21:36:48 UTC - Patrik Kleindl: I bet Pulsar or BK can be wrecked by the same things. There's no free lunch. ---- 2020-05-21 21:37:10 UTC - David Kjerrumgaard: Well, I am glad you are using your experience to help the Pulsar community! ---- 2020-05-21 21:37:22 UTC - Raman Gupta: I haven't used Pulsar as much as Kafka, but so far it's been rock-solid in comparison to Kafka. ---- 2020-05-21 21:37:40 UTC - Raman Gupta: (Once the initial setup was done, which admittedly, wasn't easy) ---- 2020-05-21 21:38:08 UTC - David Kjerrumgaard: Helm chart didn't work? ^^^ ---- 2020-05-21 21:38:37 UTC - Raman Gupta: I had issues with the initial k8s setup, but I was using the obsolete templates in the pulsar repo, not the helm chart. ---- 2020-05-21 21:39:37 UTC - Raman Gupta: I had one issue the other day that I thought was Pulsar's fault, but it turned out, no, a Kafka consumer reset its offsets for no particular reason and wrote a bunch of stuff to Pulsar that it shouldn't have. ---- 2020-05-21 21:39:52 UTC - Patrik Kleindl: @David Kjerrumgaard There are still so many companies which run on-prem and without k8s. And running k8s without dedication won't help with Pulsar or Kafka :upside_down_face: ---- 2020-05-21 21:40:34 UTC - Raman Gupta: I run both Pulsar and Kafka on k8s (with dedication), so I'm comparing apples to apples. ---- 2020-05-21 21:43:13 UTC - David Kjerrumgaard: @Patrik Kleindl While that is definitely true, the overwhelming trend I am seeing these days is to migrate as much as possible to the cloud. Even the traditional on-prem software vendors have moved to cloud-based offerings due to customer demand. ---- 2020-05-21 21:51:44 UTC - Patrik Kleindl: The vendors yes, but at least here in Europe customer adoption is slow. ---- 2020-05-21 21:59:23 UTC - David Kjerrumgaard: Why is that? ---- 2020-05-21 22:01:03 UTC - David Kjerrumgaard: There are also a lot of BYOK8s solutions now as well that allow you to run your own K8s environment, such as <https://gravitational.com/gravity/docs/>. ---- 2020-05-21 22:03:51 UTC - Patrik Kleindl: FUD regarding the cloud, mainly from GDPR and corporate policies about having your data with American companies, plus lots of old-school on-prem operations with reluctance to change. And it doesn't help if you only run your streaming infrastructure in the cloud; your applications and other stuff have to move too. ---- 2020-05-21 22:23:37 UTC - Greg Methvin: Did you try `-F 'data=@<path_to_file>'`? ---- 2020-05-21 22:24:59 UTC - Luke Stephenson: @Matteo Merli Thanks for looking into this. Here are the broker logs during startup. 
---- 2020-05-21 22:25:35 UTC - Matteo Merli: There you go :slightly_smiling_face: ```2020-05-21T06:21:19.620Z,i-0f0dcca57497394e9,pulsar-all,[conf/broker.conf] Applying config managedLedgerDefaultWriteQuorum = 3 2020-05-21T06:21:19.620Z,i-0f0dcca57497394e9,pulsar-all,[conf/broker.conf] Applying config managedLedgerDefaultEnsembleSize = 3 2020-05-21T06:21:19.620Z,i-0f0dcca57497394e9,pulsar-all,[conf/broker.conf] Applying config managedLedgerDefaultAckQuorum = 2``` ---- 2020-05-21 22:25:54 UTC - Matteo Merli: `managedLedgerDefaultWriteQuorum = 3` and `managedLedgerDefaultAckQuorum = 2` ---- 2020-05-21 22:26:19 UTC - Matteo Merli: that's what I was suspecting ---- 2020-05-21 22:26:59 UTC - Greg Methvin: Interestingly the functions API doesn’t appear to be documented here: <http://pulsar.apache.org/admin-rest-api/?version=2.5.1> ---- 2020-05-21 22:27:05 UTC - Matteo Merli: I'd suggest changing ensembleSize and writeQuorum to 2, matching the ack quorum ---- 2020-05-21 22:27:36 UTC - Greg Methvin: I’m not sure how I figured out the parameter I needed was named `data`, but I have code that uses that name so I guess that’s right. 100 : Tanner Nilsson ---- 2020-05-21 22:27:42 UTC - Matteo Merli: that will prevent the BK client from accumulating messages in memory when 1 of the bookies is slow/timing out ---- 2020-05-21 22:28:58 UTC - Matteo Merli: If these values are the defaults in the helm chart.. then the helm chart should be fixed ASAP :slightly_smiling_face: ---- 2020-05-21 22:34:50 UTC - Matteo Merli: @Luke Stephenson <https://github.com/apache/pulsar-helm-chart/pull/13> ---- 2020-05-21 23:45:14 UTC - Hiroyuki Yamada: @David Kjerrumgaard Thank you. Sorry, my explanation was not enough. Yes, I know the data in a bookie is replicated, but that is another thing. I’m wondering how easily I can recover a bookie node to keep the data fully replicated. For example, with 3 replicas on bookie nodes, 1 node crashed and unfortunately lost all its data, so now there are only 2 replicas. How do you recover to a state where 3 replicas are fully replicated? I think there are usually multiple ways to do this in distributed data management systems, such as: 1. Bring data from other nodes (like Bookie (Auto) recovery) 2. Use a backup/snapshot to restore to a certain point, then bring data from other nodes. As far as I have investigated and used Pulsar so far, only option 1 is supported and option 2 is not. So, my first question is: does everyone do option 1 for such a case? If there are other options (except for geo-replication, since that's another story again), I would like to know. My second question is: can we utilize closed ledgers as a kind of option 2 with the current implementation, since ledger data is immutable? ---- 2020-05-21 23:48:19 UTC - David Kjerrumgaard: I am only aware of people using option #1, and relying on the BookKeeper <https://bookkeeper.apache.org/docs/latest/admin/autorecovery/|auto-recovery feature> to self-heal. +1 : Hiroyuki Yamada man-bowing : Hiroyuki Yamada ---- 2020-05-22 00:31:35 UTC - Luke Stephenson: Thanks. I'll give the config changes in your PR a go ---- 2020-05-22 00:52:04 UTC - Matt Mitchell: Thanks @David Kjerrumgaard! ---- 2020-05-22 01:26:46 UTC - Luke Stephenson: Seems to have made a huge difference to stability. +1 : Matteo Merli ---- 2020-05-22 01:46:43 UTC - Matteo Merli: There are a few problems in backing up data for bookies: 1. If the traffic is non-trivial, the backup system needs to be very performant, as in a "log storage system"; otherwise it might not be able to keep up 2. 
There are 2 parts to back up: the data and the metadata. It's not easy to take an "atomic" snapshot of the two and reconstruct a consistent view. ---- 2020-05-22 02:41:05 UTC - snowcrumble: @snowcrumble has joined the channel ---- 2020-05-22 02:41:13 UTC - Sijie Guo: <http://pulsar.apache.org/functions-rest-api/?version=2.5.1> ---- 2020-05-22 02:41:28 UTC - Sijie Guo: The function endpoints were split out into a separate swagger file. ---- 2020-05-22 03:53:28 UTC - Hiroyuki Yamada: @David Kjerrumgaard Thank you. @Matteo Merli Thank you for the reply. Ok, backing up immutable ledger data seems not very easy to do due to the 2nd problem. Hmm, do we have any plans to support such an atomic snapshot? (it doesn't seem to be too difficult) Or do you think the current recovery with (auto) recovery is good enough? ---- 2020-05-22 04:01:03 UTC - Matteo Merli: > Hmm, do we have any plans to support such an atomic snapshot? (it doesn't seem to be too difficult) Oh, it's a very difficult problem! :slightly_smiling_face: In the past we gave some thought to implementing a kind of "rollback" operation to protect against accidental data deletion operations (eg: rollback a topic to the same exact state where it was 1h ago), though that's a slightly different goal. > Or do you think the current recovery with (auto) recovery is good enough? To protect against node failures, yes. There are other variants of this approach that we are playing with, though these are more geared towards cloud deployments where disks can be lost more frequently. ---- 2020-05-22 04:34:10 UTC - Hiroyuki Yamada: @Matteo Merli > Oh, it's a very difficult problem! Oh, excuse me, I wasn't really sure about it. Since it is an in-node issue, I thought it seemed relatively less complex (like by taking some locks or something, which might kill the performance). Anyways, I got it. Thank you. > To protect against node failures, yes. There are other variants of this approach that we are playing with, though these are more geared towards cloud deployments where disks can be lost more frequently. You mean `node failures` includes node failure due to disk failures? I'm planning to use it in a cloud environment and will store possibly lots of data for at least several years, so I'm concerned that auto recovery possibly can't catch up. ---- 2020-05-22 04:35:34 UTC - Matteo Merli: > Since it is an in-node issue, No, the metadata (pointing to these ledgers) is kept in ZooKeeper. Even if we restore the data on a new node, we need to update the metadata to make it point to the new node. man-bowing : Hiroyuki Yamada scream : Hiroyuki Yamada ---- 2020-05-22 04:35:57 UTC - Matteo Merli: > I'm planning to use it in a cloud environment and will store possibly lots of data for at least several years, so I'm concerned that auto recovery possibly can't catch up. If you use EBS-like storage volumes, you shouldn't be worrying about it. ---- 2020-05-22 04:36:44 UTC - Matteo Merli: In the sense that the disk itself is already replicated, so you can always restart a new container/VM and mount that same volume. ---- 2020-05-22 04:42:40 UTC - Hiroyuki Yamada: OK, thank you very much for your support. ---- 2020-05-22 05:56:32 UTC - Deepa: I don't see it when I run it normally with a Thread.sleep(90*1000) for the wait. So with this, if the client is idle and not producing any messages, the connection is still intact and doesn't get terminated. 
But if the client goes into a hung state, the current connection will be terminated at the broker, and whenever the client comes back a new connection is established automatically (I didn't have to create a new connection, and the program didn't terminate; messages were produced using the same client object). Please correct me if my understanding is wrong here. ---- 2020-05-22 07:54:01 UTC - VanderChen: I have set keyBasedBatcher as follows, but it still doesn't work. ```producer = client.newProducer() .batcherBuilder(BatcherBuilder.KEY_BASED) .enableBatching(true) .topic("my-topic") .create();``` ---- 2020-05-22 08:28:54 UTC - Ken Huang: Hi, I want to do <https://www.splunk.com/en_us/blog/it/geo-replication-in-apache-pulsar-part-1-concepts-and-features.html|synchronous geo-replication>. I set this in the broker: ```bookkeeperClientRegionawarePolicyEnabled: "true" bookkeeperClientReorderReadSequenceEnabled: "true"``` Do I need to run set-bookie-rack for the bookies? ----
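On the key-based batching question above: a minimal Java sketch, assuming a broker at `pulsar://localhost:6650` and a topic named `my-topic` (both placeholders, not anything confirmed in this thread). The point it illustrates is that `BatcherBuilder.KEY_BASED` only has something to group once the messages themselves carry keys, and it usually matters in combination with a Key_Shared subscription on the consumer side.

```
import org.apache.pulsar.client.api.*;
import java.util.concurrent.TimeUnit;

public class KeyBasedBatchingExample {
    public static void main(String[] args) throws Exception {
        // Placeholder broker URL for a local standalone instance.
        PulsarClient client = PulsarClient.builder()
                .serviceUrl("pulsar://localhost:6650")
                .build();

        // KEY_BASED batching groups messages that share the same key into the same batch,
        // so each message must actually carry a key for the batcher to have any effect.
        Producer<byte[]> producer = client.newProducer()
                .topic("my-topic")
                .enableBatching(true)
                .batcherBuilder(BatcherBuilder.KEY_BASED)
                .batchingMaxPublishDelay(10, TimeUnit.MILLISECONDS)
                .create();

        for (int i = 0; i < 100; i++) {
            producer.newMessage()
                    .key("key-" + (i % 5))           // without a key, KEY_BASED batching cannot group messages
                    .value(("msg-" + i).getBytes())
                    .sendAsync();
        }
        producer.flush();

        // Key_Shared consumers are the usual reason to use KEY_BASED batching:
        // within the subscription, each key is dispatched to a single consumer.
        Consumer<byte[]> consumer = client.newConsumer()
                .topic("my-topic")
                .subscriptionName("key-shared-sub")
                .subscriptionType(SubscriptionType.Key_Shared)
                .subscribe();

        Message<byte[]> msg = consumer.receive(5, TimeUnit.SECONDS);
        if (msg != null) {
            consumer.acknowledge(msg);
        }

        consumer.close();
        producer.close();
        client.close();
    }
}
```

This is only a sketch of the mechanics; it does not claim to reproduce or diagnose the exact setup described in the question.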

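And for the request/reply discussion earlier in the log, a rough sketch of the "return address" pattern David described: the requester embeds the name of its own reply topic in a message property, and the responder reads that property and publishes the reply directly to it. The topic names, the `reply-topic` property key, and the subscription names below are all made-up placeholders, and both sides are shown in one process only to keep the example self-contained.

```
import org.apache.pulsar.client.api.*;
import java.nio.charset.StandardCharsets;

public class ReturnAddressExample {
    private static final String REPLY_TOPIC_PROP = "reply-topic";    // hypothetical property key

    public static void main(String[] args) throws Exception {
        PulsarClient client = PulsarClient.builder()
                .serviceUrl("pulsar://localhost:6650")               // placeholder broker URL
                .build();

        // Each requesting server owns a dedicated reply topic and is its only subscriber.
        String replyTopic = "persistent://my-tenant/my-ns/replies-server-1";

        Consumer<byte[]> replyConsumer = client.newConsumer()
                .topic(replyTopic)
                .subscriptionName("server-1-replies")
                .subscriptionType(SubscriptionType.Exclusive)
                .subscribe();

        Producer<byte[]> requestProducer = client.newProducer()
                .topic("persistent://my-tenant/my-ns/requests")
                .create();

        // Send the request with the reply topic embedded as a message property.
        requestProducer.newMessage()
                .property(REPLY_TOPIC_PROP, replyTopic)
                .value("do-something".getBytes(StandardCharsets.UTF_8))
                .send();

        // --- Responder side (normally a different process) ---
        Consumer<byte[]> requestConsumer = client.newConsumer()
                .topic("persistent://my-tenant/my-ns/requests")
                .subscriptionName("workers")
                .subscriptionType(SubscriptionType.Shared)
                .subscribe();

        Message<byte[]> request = requestConsumer.receive();
        String replyTo = request.getProperty(REPLY_TOPIC_PROP);

        // Publish the reply directly to the requester's own topic.
        try (Producer<byte[]> replyProducer = client.newProducer().topic(replyTo).create()) {
            replyProducer.send("done".getBytes(StandardCharsets.UTF_8));
        }
        requestConsumer.acknowledge(request);

        // Back on the requester: only this server receives the reply.
        Message<byte[]> reply = replyConsumer.receive();
        replyConsumer.acknowledge(reply);
        System.out.println("reply: " + new String(reply.getValue(), StandardCharsets.UTF_8));

        client.close();
    }
}
```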