2019-05-28 09:11:07 UTC - Lewey: Has anybody used the API for creating a persistent partitioned topic? The docs don't seem to include the parameter you need to specify the number of partitions, and without it I am getting a response of `The request entity cannot be empty.` ---- 2019-05-28 09:14:00 UTC - jia zhai: <https://pulsar.apache.org/docs/en/pulsar-admin/#create-partitioned-topic> @Lewey Is this the link? ---- 2019-05-28 09:14:35 UTC - jia zhai: The number of partitions is needed ---- 2019-05-28 09:14:44 UTC - divyasree: I have left the cipher field blank.... ---- 2019-05-28 09:14:56 UTC - divyasree: now I am getting the old error ---- 2019-05-28 09:15:05 UTC - divyasree: ``` 09:10:51.739 [main] WARN org.apache.pulsar.functions.utils.Actions - Error completing action [ Creating producer for assignment topic persistent://public/functions/assignments ] [ATTEMPT] 2/5 09:11:01.742 [pulsar-client-io-41-1] INFO org.apache.pulsar.client.impl.ConnectionPool - [[id: 0x9fcd949b, L:/127.0.0.1:41000 - R:127.0.0.1/127.0.0.1:6650]] Connected to server 09:11:01.744 [pulsar-io-23-3] INFO org.apache.pulsar.broker.service.ServerCnx - New connection from /127.0.0.1:41000 09:11:01.745 [pulsar-io-23-3] WARN org.apache.pulsar.broker.service.ServerCnx - [/127.0.0.1:41000] Unable to authenticate: Unsupported authentication mode: none 09:11:01.745 [pulsar-io-23-3] INFO org.apache.pulsar.broker.service.ServerCnx - Closed connection from /127.0.0.1:41000 09:11:01.745 [pulsar-client-io-41-1] WARN org.apache.pulsar.client.impl.ClientCnx - [127.0.0.1/127.0.0.1:6650] Got exception IllegalArgumentException : null ``` ---- 2019-05-28 09:15:12 UTC - Lewey: It's the REST API that I am trying to use, <https://pulsar.apache.org/admin-rest-api/#operation/createPartitionedTopic> ---- 2019-05-28 09:15:25 UTC - divyasree: all the above logs are WARN ---- 2019-05-28 09:15:41 UTC - divyasree: ``` ERROR org.apache.pulsar.functions.worker.SchedulerManager - Exception while at creating producer to topic 
persistent://public/functions/assignments java.util.concurrent.ExecutionException: org.apache.pulsar.client.api.PulsarClientException: Connection already closed at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357) ~[?:1.8.0_212] at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1915) ~[?:1.8.0_212] at org.apache.pulsar.functions.worker.SchedulerManager.lambda$createProducer$0(SchedulerManager.java:112) ~[org.apache.pulsar-pulsar-functions-worker-2.3.1.jar:2.3.1] ``` ---- 2019-05-28 09:15:50 UTC - divyasree: this is the actual error ---- 2019-05-28 09:16:23 UTC - jia zhai: I see ---- 2019-05-28 09:16:38 UTC - jia zhai: This is being fixed ---- 2019-05-28 09:17:07 UTC - Sijie Guo: @Lewey FYI the REST API docs are generated by Swagger. We missed quite a lot of annotations and are working on improving them. There are currently a bunch of pull requests improving this part; hopefully they will be merged soon. ---- 2019-05-28 09:17:45 UTC - Sijie Guo: which version are you using, 2.3.1? ---- 2019-05-28 09:18:09 UTC - Sijie Guo: if you are using 2.3.1, you might need to manually configure the authentication and authorization in functions_worker.yml. ---- 2019-05-28 09:18:41 UTC - Sijie Guo: <http://pulsar.apache.org/docs/en/next/functions-worker/#security-settings> ---- 2019-05-28 09:18:46 UTC - Sijie Guo: check out this section ---- 2019-05-28 09:18:52 UTC - Lewey: @Sijie Guo Do you know the param to pass to specify the number of partitions? ---- 2019-05-28 09:19:22 UTC - Sijie Guo: or I would suggest you disable the function worker first and make sure you configure authentication and authorization for the broker first. 
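---- On the partitioned-topic question above: a minimal sketch of the REST call, assuming the v2 admin endpoint `PUT /admin/v2/persistent/{tenant}/{namespace}/{topic}/partitions` takes the partition count as a bare integer in the request body, which would explain the empty-entity error when no body is sent. The broker address `localhost:8080`, the `public/default` namespace, the topic name `my-topic`, and the helper function names are placeholders, not from the thread, and the call is shown without authentication.

```shell
#!/usr/bin/env bash
# Sketch only: the admin address, tenant/namespace/topic names, and the
# function names below are assumptions for illustration.
ADMIN_URL="${ADMIN_URL:-http://localhost:8080}"

# Build the v2 admin URL for a persistent topic's partition metadata.
partitions_url() {
  local tenant="$1" namespace="$2" topic="$3"
  printf '%s/admin/v2/persistent/%s/%s/%s/partitions\n' \
    "$ADMIN_URL" "$tenant" "$namespace" "$topic"
}

# PUT the partition count as the entire request body (a bare integer).
create_partitioned_topic() {
  local tenant="$1" namespace="$2" topic="$3" partitions="$4"
  curl -sS -X PUT \
    -H 'Content-Type: application/json' \
    -d "$partitions" \
    "$(partitions_url "$tenant" "$namespace" "$topic")"
}

# Usage (needs a running broker):
#   create_partitioned_topic public default my-topic 4
```

Keeping the URL builder separate from the `curl` call makes it easy to check the path before sending anything to a broker.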
---- 2019-05-28 09:19:56 UTC - Sijie Guo: after it is running okay, then you enable function worker and follow the documentation to configure authentication and authorization for function worker ---- 2019-05-28 09:21:42 UTC - Sijie Guo: @Lewey you can just post with an integer (the number of partitions) ---- 2019-05-28 09:26:38 UTC - divyasree: when i disabled the function worker, i am able to start the broker successfully... ---- 2019-05-28 09:27:02 UTC - divyasree: thanks for the help... ---- 2019-05-28 09:27:22 UTC - divyasree: i am trying to authorize the namespace ---- 2019-05-28 09:27:26 UTC - divyasree: with this command ---- 2019-05-28 09:27:32 UTC - divyasree: ``` bin/pulsar-admin namespaces grant-permission divya-tenant/divya-namespace \ --role test-user \ --actions produce,consume ``` ---- 2019-05-28 09:27:51 UTC - divyasree: but getting ``` HTTP 401 Unauthorized
Reason: HTTP 401 Unauthorized ``` ---- 2019-05-28 09:29:56 UTC - Sijie Guo: is your admin role in `--admin-roles` of tenant `divya-tenant`? ---- 2019-05-28 09:30:15 UTC - divyasree: yes ---- 2019-05-28 09:31:05 UTC - Sijie Guo: bin/pulsar-admin tenants get divya-tenant ---- 2019-05-28 09:31:13 UTC - Sijie Guo: can you run this and show me the result? ---- 2019-05-28 09:31:32 UTC - divyasree: ``` bin/pulsar-admin tenants get divya-tenant HTTP 401 Unauthorized Reason: HTTP 401 Unauthorized ``` ---- 2019-05-28 09:31:47 UTC - Sijie Guo: interesting ---- 2019-05-28 09:31:58 UTC - Sijie Guo: how did you create divya-tenant? ---- 2019-05-28 09:32:25 UTC - Sijie Guo: did you configure pulsar-admin with the token? ---- 2019-05-28 09:34:45 UTC - divyasree: I created the tenant with this command ``` bin/pulsar-admin tenants create divya-tenant --allowed-clusters test-ttc,test ``` ---- 2019-05-28 09:35:16 UTC - divyasree: test-ttc and test are cluster names in different regions... ---- 2019-05-28 09:36:09 UTC - divyasree: I didn't configure anything separately... I am following this link for token generation <https://pulsar.apache.org/docs/en/security-token-admin/> ---- 2019-05-28 09:39:39 UTC - Sijie Guo: did you create tenants before configuring authentication or after? ---- 2019-05-28 09:40:15 UTC - Sijie Guo: <https://pulsar.apache.org/docs/en/security-token-client/#cli-tools> did you configure your pulsar admin to use token authentication? ---- 2019-05-28 09:40:48 UTC - divyasree: I created the tenant before configuring authentication ---- 2019-05-28 09:41:06 UTC - Sijie Guo: I see. ---- 2019-05-28 09:42:04 UTC - Sijie Guo: so you need to configure pulsar-admin. In order to change the tenant settings or create a new tenant, your role has to be a superuser ---- 2019-05-28 09:45:48 UTC - Lewey: cheers ---- 2019-05-28 10:26:45 UTC - divyasree: configure pulsar-admin means... changes in client.conf, you mean? 
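---- For reference, a sketch of what "configure pulsar-admin" could look like with token authentication, assuming the `authPlugin` and `authParams` keys described on the security-token-client page linked above. The helper name `token_client_conf`, the web service URL, and the token file name are hypothetical; the token's role would also need to be listed in the broker's `superUserRoles` for tenant-level commands to succeed.

```shell
#!/usr/bin/env bash
# Sketch only: function name, URL default, and token file are assumptions.

# Print the client.conf lines that switch pulsar-admin to token auth.
token_client_conf() {
  local token="$1"
  printf 'webServiceUrl=%s\n' "${WEB_SERVICE_URL:-http://localhost:8080/}"
  printf 'authPlugin=org.apache.pulsar.client.impl.auth.AuthenticationToken\n'
  printf 'authParams=token:%s\n' "$token"
}

# Usage: review the printed lines, merge them into conf/client.conf
# (replacing any existing authPlugin/authParams entries), then retry:
#   token_client_conf "$(cat my-admin-token.txt)"
#   bin/pulsar-admin tenants get divya-tenant
```

Printing the lines instead of appending them blindly avoids duplicate keys when client.conf already contains auth settings.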
---- 2019-05-28 12:11:41 UTC - Sijie Guo: Yes, correct ---- 2019-05-28 14:10:29 UTC - Alexandre DUVAL: Hi, when I create a tenant using the REST API, how should the tenantInfo parameter be passed? Do you have an example? (allowedClusters, adminRoles) ---- 2019-05-28 14:11:16 UTC - Alexandre DUVAL: The REST API documentation on the Pulsar website doesn't show the body parameters. ---- 2019-05-28 14:21:10 UTC - chris: @Alexandre DUVAL ``` curl -X PUT -H "Content-Type: application/json" http://localhost:8080/admin/v2/tenants/my-tenant -d '{"adminRoles":["admin"],"allowedClusters":["test"]}' ``` ---- 2019-05-28 14:38:42 UTC - David Kjerrumgaard: @dba Thanks for contributing back to the community!! ---- 2019-05-28 15:57:57 UTC - Alexandre DUVAL: Thx ---- 2019-05-28 16:52:42 UTC - Thor Sigurjonsson: Anyone have good resources on managing bookkeeper? ---- 2019-05-28 16:52:55 UTC - Thor Sigurjonsson: Things like quickly validating if it is healthy, etc.? ---- 2019-05-28 16:53:35 UTC - Thor Sigurjonsson: I've only been going to the basic docs at <https://bookkeeper.apache.org> ---- 2019-05-28 16:54:11 UTC - Thor Sigurjonsson: But there are a lot more things exposed in the bookkeeper shell than appear to be covered in those docs. ---- 2019-05-28 16:57:07 UTC - Thor Sigurjonsson: Context: I'm in the process of fixing a test environment that had some infrastructure issues throw it for a loop, so I'm doing more bk stuff than I normally have to. ---- 2019-05-28 17:12:48 UTC - Thor Sigurjonsson: I'm guessing there might be both a metrics approach and a bookkeeper cli approach... ---- 2019-05-28 17:49:28 UTC - Thor Sigurjonsson: I'm finding that the important topic `persistent://public/functions/assignments` is in a bad state... 
---- 2019-05-28 17:49:37 UTC - Thor Sigurjonsson: Makes it hard to get things started :slightly_smiling_face: ---- 2019-05-28 17:50:40 UTC - Jerry Peng: @Thor Sigurjonsson you can try: ``` $ bin/bookkeeper shell bookiesanity ``` ---- 2019-05-28 17:50:51 UTC - Jerry Peng: <http://pulsar.apache.org/docs/en/deploy-bare-metal-multi-cluster/#starting-up-bookies> ---- 2019-05-28 17:50:58 UTC - Thor Sigurjonsson: Thank you! ---- 2019-05-28 17:51:16 UTC - Thor Sigurjonsson: Are there good ways to recover topics? I'm seeing this for example: ```7:27:15.721 [BookKeeperClientWorker-OrderedExecutor-4-0] ERROR org.apache.bookkeeper.client.PendingReadOp - Read of ledger entry failed: L1053636 E0-E0, Sent to [dec07.overstock.com:3181, dec01.overstock.com:3181], Heard from [] : bitset = {}, Error = 'No such ledger exists'. First unread entry is (-1, rc = null) 17:27:15.721 [bookkeeper-ml-workers-OrderedExecutor-0-0] WARN org.apache.bookkeeper.mledger.impl.OpReadEntry - [public/functions/persistent/assignments][null] read failed from ledger at position:1053636:0 : No such ledger exists 17:27:15.721 [broker-topic-workers-OrderedScheduler-7-0] ERROR org.apache.pulsar.broker.service.persistent.PersistentDispatcherSingleActiveConsumer - [persistent://public/functions/assignments / -Consumer{subscription=PersistentSubscription{topic=persistent://public/functions/assignments, name=reader-a3471ea427}, consumerId=2, consumerName=4b21c, address=/10.15.34.180:35892}] Error reading entries at 1053636:0 : No such ledger exists - Retrying to read in 55.11 seconds``` ---- 2019-05-28 17:54:26 UTC - Thor Sigurjonsson: And `bin/bookkeeper shell bookiesanity` test succeeded on all my bookies ---- 2019-05-28 17:57:59 UTC - Thor Sigurjonsson: Also, at this point I'm not seeing most of the topics (yet?) on it. 
---- 2019-05-28 17:58:14 UTC - Jerry Peng: @Thor Sigurjonsson I am not a bookkeeper expert; perhaps @Matteo Merli or @Sijie Guo can chime in ---- 2019-05-28 17:59:03 UTC - Jerry Peng: @Thor Sigurjonsson can you describe what exactly happened to cause this? ---- 2019-05-28 17:59:10 UTC - Thor Sigurjonsson: @Jerry Peng Cool, thank you. I'm taking the opportunity to learn by doing on this one. We may fall back to a redeploy on this test cluster, but until @Devin G. Bost is ready to do that, I'm seeing what I can learn and recover. ---- 2019-05-28 18:00:24 UTC - Thor Sigurjonsson: @Jerry Peng Sure, we've got some desktops as a test environment. Our IT group has us do DHCP and we have to supply 802.1X credentials to get these machines on the network. ---- 2019-05-28 18:00:37 UTC - Thor Sigurjonsson: They went through a creds reset cycle. ---- 2019-05-28 18:00:50 UTC - Thor Sigurjonsson: and so they all lost network about the same time and came back to the network about the same time. ---- 2019-05-28 18:01:11 UTC - Thor Sigurjonsson: So the cluster came up in a funny state. ---- 2019-05-28 18:01:45 UTC - Thor Sigurjonsson: Probably the brokers got bookies back in an unusual state with the brokers joining sporadically ---- 2019-05-28 18:01:52 UTC - Jerry Peng: so the machines now have different IPs? ---- 2019-05-28 18:02:04 UTC - Thor Sigurjonsson: jars may not have been available for functions or ledgers for topics either. ---- 2019-05-28 18:02:48 UTC - Thor Sigurjonsson: We went through that last time; I changed the way bookkeeper tracks the bookie IDs after learning from that prior experience. ---- 2019-05-28 18:03:03 UTC - Thor Sigurjonsson: you can do hostnames instead of IPs, if I remember correctly ---- 2019-05-28 18:03:12 UTC - Jerry Peng: yes ---- 2019-05-28 18:03:20 UTC - Jerry Peng: so those haven’t changed? 
---- 2019-05-28 18:03:30 UTC - Thor Sigurjonsson: hostnames should be the same ---- 2019-05-28 18:04:12 UTC - Jerry Peng: none of the data on the bookies was lost? ---- 2019-05-28 18:04:27 UTC - Thor Sigurjonsson: It does complain about missing ledgers, but the volumes came back fine. ---- 2019-05-28 18:04:50 UTC - Thor Sigurjonsson: I'm mounting them from the host into the container running the bookkeeper processes ---- 2019-05-28 18:05:11 UTC - Thor Sigurjonsson: that's been working fine for months ---- 2019-05-28 18:05:11 UTC - Jerry Peng: well, the metadata in zookeeper indicates there should be copies of the ledger on dec07.overstock.com:3181, dec01.overstock.com:3181 ---- 2019-05-28 18:05:52 UTC - Jerry Peng: Is your DHCP resolving the correct IPs for those hostnames? ---- 2019-05-28 18:06:18 UTC - Thor Sigurjonsson: ```[root@dec01 ~]# salt '*' test.ping dec06.overstock.com: True dec05.overstock.com: True dec04.overstock.com: True dec07.overstock.com: True dec02.overstock.com: True dec03.overstock.com: True dec01.overstock.com: True``` ---- 2019-05-28 18:06:23 UTC - Thor Sigurjonsson: it would appear so, yes. ---- 2019-05-28 18:07:22 UTC - Thor Sigurjonsson: I think we've "lost" bookies sporadically in this same fashion; we had different people's creds on each host. ---- 2019-05-28 18:07:42 UTC - Thor Sigurjonsson: I've unified them to one for now so we have this happen more predictably :slightly_smiling_face: ---- 2019-05-28 18:09:35 UTC - Thor Sigurjonsson: I configured bookkeeper.conf `useHostNameAsBookieID=true` last time around. ---- 2019-05-28 18:09:42 UTC - Thor Sigurjonsson: I could see that it took on all the hosts. 
---- 2019-05-28 18:09:52 UTC - Thor Sigurjonsson: in case we had a bad bookie all this time. ---- 2019-05-28 18:09:59 UTC - Jerry Peng: Though I am wondering if dec07.overstock.com:3181, dec01.overstock.com:3181 is resolving to IPs of different machines after moving to DHCP? ---- 2019-05-28 18:10:33 UTC - Jerry Peng: is there a way for you to verify that dec07.overstock.com:3181, dec01.overstock.com:3181 still points to the same machines as before? ---- 2019-05-28 18:10:35 UTC - Thor Sigurjonsson: It could, yes. Salt would not have come back `True` if it was a bad salt-minion. ---- 2019-05-28 18:10:50 UTC - Jerry Peng: i see ---- 2019-05-28 18:11:28 UTC - Thor Sigurjonsson: Yes, I'm logging into both with no name weirdness. ---- 2019-05-28 18:12:43 UTC - Thor Sigurjonsson: `useHostNameAsBookieID` is true on all the machines. ---- 2019-05-28 18:12:59 UTC - Thor Sigurjonsson: I did have to edit some files in the data dir back then too (that was months ago). ---- 2019-05-28 18:13:05 UTC - Thor Sigurjonsson: Wonder if that is still good. ---- 2019-05-28 18:13:18 UTC - Thor Sigurjonsson: (if we had a bad bookie come back after all this time) ---- 2019-05-28 18:14:42 UTC - Thor Sigurjonsson: I edited this file back then to replace `bookieHost` IPs with hostnames. 
``` [root@dec01 current]# pwd /data/bookie/bookkeeper/ledgers/current [root@dec01 current]# cat VERSION 4 bookieHost: "dec01.overstock.com:3181" journalDir: "data/bookkeeper/journal" ledgerDirs: "1\tdata/bookkeeper/ledgers" instanceId: "afb64bb7-d64d-40f0-8628-74ec84f0eca7"``` ---- 2019-05-28 18:15:20 UTC - Thor Sigurjonsson: (Hope this troubleshooting spam on General channel is useful to someone else later :) ---- 2019-05-28 18:16:42 UTC - Thor Sigurjonsson: ```[root@dec01 pulsar]# salt '*' cmd.run 'cat /data/bookie/bookkeeper/ledgers/current/VERSION | grep bookieHost' dec06.overstock.com: bookieHost: "dec06.overstock.com:3181" dec05.overstock.com: bookieHost: "dec05.overstock.com:3181" dec02.overstock.com: bookieHost: "dec02.overstock.com:3181" dec04.overstock.com: bookieHost: "dec04.overstock.com:3181" dec07.overstock.com: bookieHost: "dec07.overstock.com:3181" dec01.overstock.com: bookieHost: "dec01.overstock.com:3181" dec03.overstock.com: bookieHost: "dec03.overstock.com:3181"``` All looking consistent. ---- 2019-05-28 18:20:37 UTC - Thor Sigurjonsson: I guess one question I would have is "can we reset topics somehow?" ---- 2019-05-28 18:21:49 UTC - Thor Sigurjonsson: Also, "what is the default replication factor in bookkeeper for ledgers?" and "...is that configurable?" We set the jar replication rate to 3 previously. ---- 2019-05-28 18:22:49 UTC - Thor Sigurjonsson: Also, "is there a way to specify the replication rate on a per-topic/namespace level?" 
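---- On the replication questions: the broker-wide defaults appear to come from `managedLedgerDefaultEnsembleSize`, `managedLedgerDefaultWriteQuorum`, and `managedLedgerDefaultAckQuorum` in broker.conf, and a namespace can override them with `pulsar-admin namespaces set-persistence`. A hedged sketch follows; the namespace `my-tenant/my-namespace`, the 3/3/2 values, and the function names are illustrative assumptions, and a new policy should only affect ledgers created after the change.

```shell
#!/usr/bin/env bash
# Sketch only: namespace, quorum values, and function names are assumptions.

# BookKeeper requires ensemble >= write quorum >= ack quorum >= 1.
valid_quorum() {
  local ensemble="$1" write_quorum="$2" ack_quorum="$3"
  [ "$ensemble" -ge "$write_quorum" ] &&
    [ "$write_quorum" -ge "$ack_quorum" ] &&
    [ "$ack_quorum" -ge 1 ]
}

# Apply a persistence policy to one namespace (needs a reachable broker).
set_namespace_replication() {
  local namespace="$1" ensemble="$2" write_quorum="$3" ack_quorum="$4"
  if ! valid_quorum "$ensemble" "$write_quorum" "$ack_quorum"; then
    echo "invalid quorum settings: $ensemble/$write_quorum/$ack_quorum" >&2
    return 1
  fi
  bin/pulsar-admin namespaces set-persistence "$namespace" \
    --bookkeeper-ensemble "$ensemble" \
    --bookkeeper-write-quorum "$write_quorum" \
    --bookkeeper-ack-quorum "$ack_quorum" \
    --ml-mark-delete-max-rate 0
}

# Usage:
#   set_namespace_replication my-tenant/my-namespace 3 3 2
#   bin/pulsar-admin namespaces get-persistence my-tenant/my-namespace
```

The up-front check rejects combinations BookKeeper cannot satisfy before anything is sent to the broker.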
---- 2019-05-28 18:24:16 UTC - Jerry Peng: @Thor Sigurjonsson you can always delete the topic as a way of resetting it ---- 2019-05-28 18:25:00 UTC - Jerry Peng: This is ok for the function’s “assignment” topic but not for the “metadata” topic ---- 2019-05-28 18:26:16 UTC - Thor Sigurjonsson: Yes, that makes sense... I guess for general topics, if we set various policies on those, we'd just want those recreated as well. That's starting to look more like a 're-deploy' then also. ---- 2019-05-28 18:26:55 UTC - Thor Sigurjonsson: I'll see what progress I can make deleting that functions assignments topic (we may have more issues). ---- 2019-05-28 18:28:31 UTC - Thor Sigurjonsson: Although I'm definitely seeing `HTTP 412 Precondition Failed` with active subscribers. ---- 2019-05-28 18:28:51 UTC - Matteo Merli: @Thor Sigurjonsson from the log above, is it possible that the DNS names for some of the bookies are pointing to the IPs of different bookies, after the DHCP change? ---- 2019-05-28 18:29:41 UTC - Thor Sigurjonsson: @Matteo Merli It's conceivable... dec01 is not one of them though. It's the salt master and I'd know it as I log in. ---- 2019-05-28 18:30:19 UTC - Thor Sigurjonsson: Although I think salt keys would prevent them from coming back with changed host keys. ---- 2019-05-28 18:31:02 UTC - Thor Sigurjonsson: Looks like `salt-key -L` lists them all as normal accepted keys and no bad keys in the mix. ---- 2019-05-28 18:34:29 UTC - Matteo Merli: I guess. But there's the fact that the broker connects to a bookie, thinking the ledger is there (based on the hostname), and the bookie doesn't have that ledger ---- 2019-05-28 18:34:57 UTC - Matteo Merli: One thing you could try is to read that ledger from all the bookies. 
Let me get the command ---- 2019-05-28 18:36:47 UTC - Matteo Merli: `bin/bookkeeper shell readledger -bookie $BOOKIE:3181 -ledgerid $LEDGER` and try against all of them ---- 2019-05-28 18:41:10 UTC - Thor Sigurjonsson: Here is what ran: ``` root@dec01:/pulsar# BOOKIE=`hostname` root@dec01:/pulsar# LEDGER=1053636 root@dec01:/pulsar# bin/bookkeeper shell readledger -bookie $BOOKIE:3181 -ledgerid $LEDGER JMX enabled by default ``` ---- 2019-05-28 18:41:43 UTC - Thor Sigurjonsson: Ran the same on all of them and they all came back the same way. Did I do that right? ---- 2019-05-28 18:47:32 UTC - Matteo Merli: It should print the entry being read there ---- 2019-05-28 18:47:55 UTC - Matteo Merli: which should be none, if the bookies don’t have that ledger ---- 2019-05-28 21:06:00 UTC - Ali Ahmed: just gauging the community interest <https://github.com/celery/celery/issues/5487> ---- 2019-05-28 21:06:26 UTC - Ali Ahmed: is a distributed task queue integration something the Pulsar community is looking for? ---- 2019-05-28 21:45:33 UTC - Grant Wu: That is something we would use if it existed ---- 2019-05-28 21:47:26 UTC - Ali Ahmed: ok, is Celery still considered the standard or is another task queue gaining more popularity? ---- 2019-05-28 21:48:58 UTC - Grant Wu: I wouldn’t know, I’m not involved in the Celery use here ---- 2019-05-28 21:49:15 UTC - Ezequiel Lovelle: I think it was the standard way with Django some time ago. ---- 2019-05-28 21:49:25 UTC - Ezequiel Lovelle: I'm not a big fan of Celery but this is great! ---- 2019-05-28 21:50:22 UTC - Ali Ahmed: I think it can be done with selective ack in Pulsar without all the baggage Celery has. 
---- 2019-05-28 21:50:28 UTC - Ali Ahmed: it will be much simpler ---- 2019-05-28 21:52:55 UTC - Grant Wu: I really don’t think Pulsar and Celery are replacements for each other ---- 2019-05-28 21:54:07 UTC - Grant Wu: The primitives Pulsar provides are fairly low-level; using pubsub in a disciplined way is not easy ---- 2019-05-28 22:20:27 UTC - Ali Ahmed: Celery is mostly an API; it has plugins for different persistence layers. RabbitMQ and Redis are the most popular, and there is partial support for Kafka as well. ----
