2019-02-27 09:20:22 UTC - Maarten Tielemans: @Matteo Merli is there a feature
ticket for OpenTracing? just so I can +1 it
----
2019-02-27 09:28:40 UTC - Marc Le Labourier: I have done nothing to change it.
Where and how should it be defined?
----
2019-02-27 10:04:02 UTC - Sébastien de Melo: We left the Grafana Docker image untouched, so it is:
"__inputs": [
  {
    "name": "DS_default",
    "label": "default",
    "description": "",
    "type": "datasource",
    "pluginId": "prometheus",
    "pluginName": "Prometheus"
  }
]
----
2019-02-27 10:07:58 UTC - Sébastien de Melo: We have added the following at the
beginning of monitoring.yaml:
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: "prometheus"
  labels:
    app: pulsar
    cluster: pulsar
rules:
- apiGroups: [""]
  resources:
  - nodes
  - nodes/proxy
  - services
  - endpoints
  - pods
  verbs: ["get", "list", "watch"]
- nonResourceURLs: ["/metrics"]
  verbs: ["get"]
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: "prometheus"
  namespace: pulsar
  labels:
    app: pulsar
    cluster: pulsar
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: "prometheus"
  labels:
    app: pulsar
    cluster: pulsar
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: "prometheus"
subjects:
- kind: ServiceAccount
  name: "prometheus"
  namespace: pulsar
You have to adapt it to your needs
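Then re-apply it, e.g. (a sketch, using the deployment/kubernetes/aws path we use):
```
kubectl apply -f deployment/kubernetes/aws/monitoring.yaml
```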
----
2019-02-27 10:09:07 UTC - Sébastien de Melo: And we had to add serviceAccount:
prometheus in the spec section of the prometheus Deployment:
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: prometheus
  labels:
    app: pulsar
    cluster: pulsar
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: pulsar
        component: prometheus
        cluster: pulsar
    spec:
      containers:
      - name: prometheus
        image: prom/prometheus:v1.6.3
        volumeMounts:
        - name: config-volume
          mountPath: /etc/prometheus
        - name: data-volume
          mountPath: /prometheus
        ports:
        - containerPort: 9090
      volumes:
      - name: config-volume
        configMap:
          name: prometheus-config
      - name: data-volume
        persistentVolumeClaim:
          claimName: prometheus-data-volume
      nodeSelector:
        onlyfor: pulsar
      serviceAccount: prometheus
----
2019-02-27 10:59:25 UTC - Marc Le Labourier: Anyone working with Kubernetes?
We seem to have problems with Prometheus and the broker metrics: some of them
are duplicated when a new topic is created.
Here you can see that pulsar_subscriptions_count is defined multiple times.
----
2019-02-27 11:13:51 UTC - Sijie Guo: the metrics have different labels, no?
----
2019-02-27 13:31:22 UTC - Sébastien de Melo: The reported error is: text format
parsing error in line 189: second TYPE line for metric name
“pulsar_subscriptions_count”, or TYPE reported after samples
----
2019-02-27 13:52:10 UTC - Maarten Tielemans: There are two definitions of `# TYPE
pulsar_subscriptions_count gauge` (lines 167 and 189).
Line 167 seems to be a subscription count across all topics? Line 189 seems to be per topic?
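(A quick way to see the duplicate TYPE lines; the broker address below is a placeholder, 8080 being the usual admin/metrics port:)
```
curl -s http://<broker-host>:8080/metrics | grep -n 'TYPE pulsar_subscriptions_count'
```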
----
2019-02-27 14:04:41 UTC - Marc Le Labourier: Yes, that's the problem. Prometheus
does not seem to like multiple definitions (even if the labels are different).
However, I don't know the origin of this behaviour or how to fix the broker
metrics.
----
2019-02-27 14:19:48 UTC - Marc Le Labourier: Seems to be related to this issue:
<https://github.com/apache/pulsar/issues/3112>
----
2019-02-27 14:32:57 UTC - Sijie Guo: i see. a temporary workaround is to disable
topic-level metrics: `exposeTopicLevelMetricsInPrometheus=false`
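A sketch of the change, assuming the stock broker.conf layout (or the broker ConfigMap in the k8s templates); the brokers need a restart afterwards:
```
# Sketch only: flip the flag in conf/broker.conf, then restart the brokers
sed -i 's/^exposeTopicLevelMetricsInPrometheus=.*/exposeTopicLevelMetricsInPrometheus=false/' conf/broker.conf
grep exposeTopicLevelMetricsInPrometheus conf/broker.conf
```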
----
2019-02-27 14:36:34 UTC - Sébastien de Melo: Ok thanks, testing it
----
2019-02-27 15:05:02 UTC - Matteo Merli: Which version of Prometheus is this
running with?
----
2019-02-27 15:08:57 UTC - Sébastien de Melo: v1.6.3
----
2019-02-27 15:09:39 UTC - Sébastien de Melo: we are using monitoring.yaml in
deployment/kubernetes/aws
----
2019-02-27 15:09:41 UTC - Matteo Merli: Ok, we are running with 2.4.x without
any problems
----
2019-02-27 15:10:05 UTC - Matteo Merli: These YAMLs should be updated at this
point
----
2019-02-27 15:28:09 UTC - David Kjerrumgaard: No worries. It looks like you
found the issue and potential fix
----
2019-02-27 16:11:53 UTC - Chris DiGiovanni: I was wondering if someone can
comment on this scenario.
I have deployed a Pulsar cluster into a custom k8s cluster using the provided
templates, with the modification of using StatefulSets for the bookies. I've
deployed 3 bookies, each with 500G ledger disks. In this scenario I start
running out of space. What is the procedure?
----
2019-02-27 16:14:53 UTC - Chris DiGiovanni: Reading the Apache BookKeeper docs,
I don't see how you can rebalance the data onto additional bookies.
----
2019-02-27 16:17:10 UTC - Matteo Merli: You can just add more nodes to augment
the overall capacity of the cluster
----
2019-02-27 16:17:50 UTC - Chris DiGiovanni: nodes = bookies?
----
2019-02-27 16:17:57 UTC - Matteo Merli: There’s no need to explicitly rebalance
the data
----
2019-02-27 16:17:59 UTC - Matteo Merli: Yes
----
2019-02-27 16:18:33 UTC - Chris DiGiovanni: I see, so Pulsar will know if a
bookie's disk is running low and there is space available elsewhere?
----
2019-02-27 16:19:18 UTC - Matteo Merli: Bookies will automatically go into
read-only mode when disk is (almost) full. When data ages out, the disk usage
will equalize across all bookies
----
2019-02-27 16:20:05 UTC - Chris DiGiovanni: @Matteo Merli Thanks for your reply
----
2019-02-27 17:14:27 UTC - Sébastien de Melo: Thank you, Prometheus 2.7.1 fixed
the issue. However, it didn't solve the other one (the topics dashboard):
Templating init failed
Datasource named ${DS_default} was not found
Do you have any clue?
----
2019-02-27 17:15:54 UTC - Matteo Merli: Not sure about that. Seems to be
related to grafana not finding the default data source
----
2019-02-27 17:16:14 UTC - Matteo Merli: You can check the data sources in the
Grafana settings
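(or via the Grafana HTTP API; host and credentials below are placeholders:)
```
# Lists the configured data sources
curl -s -u admin:admin http://<grafana-host>:3000/api/datasources
```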
----
2019-02-27 17:17:39 UTC - Sébastien de Melo: We checked. Prometheus is the
default (and only) data source
----
2019-02-27 17:37:57 UTC - Ryan Samo: Hey guys, is bookie auto-recovery truly
automatic, in that if I take nodes down it will attempt to recover the data to
the available nodes? Is that a separate thread running? We have 6 bookies, just
trying to make sure we are highly available with the ensemble, write quorum,
and ack quorum.
Thanks!
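(For reference, the ensemble / write / ack quorums are namespace persistence settings; a sketch with illustrative values only:)
```
# Illustrative values; tune to your durability/availability needs
bin/pulsar-admin namespaces set-persistence \
  --bookkeeper-ensemble 3 --bookkeeper-write-quorum 3 --bookkeeper-ack-quorum 2 \
  public/default
```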
----
2019-02-27 17:38:57 UTC - Matteo Merli: That is correct
----
2019-02-27 17:38:59 UTC - Ryan Samo: Also trying to figure out rack awareness
around that
----
2019-02-27 17:39:15 UTC - Matteo Merli: The auto recovery process is turned on
by default
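(A sketch of where to check, assuming the stock bookkeeper.conf layout:)
```
# Auto-recovery running inside each bookie:
grep autoRecoveryDaemonEnabled conf/bookkeeper.conf
# ...or run as a standalone daemon:
bin/bookkeeper autorecovery
```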
----
2019-02-27 17:40:48 UTC - Ryan Samo: So seeing errors like
BKLedgerRecoveryException: Error while recovering ledger
is normal?
----
2019-02-27 17:45:03 UTC - Ryan Samo: Tailing the bookie logs will show that
error over and over until we bring them back up. I just wondered if it would
auto recover and clear up
----
2019-02-27 18:05:06 UTC - Sébastien de Melo: What are we supposed to see in the
topics dashboard please? Query '{topic=~".+"}' in Prometheus returns no results
----
2019-02-27 18:07:40 UTC - Marc Le Labourier: Topic stats seem to be collected
from `topics_stats = get(broker_url, '/admin/broker-stats/destinations')`.
So I guess they are not available through Prometheus, as it only uses the
/metrics URL?
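(A quick way to compare the two endpoints; the broker address is a placeholder:)
```
# Detailed per-topic stats come from the admin API...
curl -s http://<broker-host>:8080/admin/broker-stats/destinations | head
# ...while Prometheus only scrapes the exposition endpoint:
curl -s http://<broker-host>:8080/metrics | grep '^pulsar_' | head
```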
----
2019-02-27 20:36:50 UTC - Grant Wu: @Matteo Merli
```
19:41:53.512 [BookKeeperClientWorker-OrderedExecutor-1-0] ERROR org.apache.bookkeeper.common.util.SafeRunnable - Unexpected throwable caught
java.lang.ArrayIndexOutOfBoundsException: -2
    at com.google.common.collect.RegularImmutableList.get(RegularImmutableList.java:60) ~[com.google.guava-guava-21.0.jar:?]
    at org.apache.bookkeeper.client.PendingReadOp$SequenceReadRequest.sendNextRead(PendingReadOp.java:400) ~[org.apache.bookkeeper-bookkeeper-server-4.9.0.jar:4.9.0]
    at org.apache.bookkeeper.client.PendingReadOp$SequenceReadRequest.read(PendingReadOp.java:382) ~[org.apache.bookkeeper-bookkeeper-server-4.9.0.jar:4.9.0]
    at org.apache.bookkeeper.client.PendingReadOp.initiate(PendingReadOp.java:526) ~[org.apache.bookkeeper-bookkeeper-server-4.9.0.jar:4.9.0]
    at org.apache.bookkeeper.client.LedgerRecoveryOp.doRecoveryRead(LedgerRecoveryOp.java:148) ~[org.apache.bookkeeper-bookkeeper-server-4.9.0.jar:4.9.0]
    at org.apache.bookkeeper.client.LedgerRecoveryOp.access$000(LedgerRecoveryOp.java:37) ~[org.apache.bookkeeper-bookkeeper-server-4.9.0.jar:4.9.0]
    at org.apache.bookkeeper.client.LedgerRecoveryOp$1.readLastConfirmedDataComplete(LedgerRecoveryOp.java:109) ~[org.apache.bookkeeper-bookkeeper-server-4.9.0.jar:4.9.0]
    at org.apache.bookkeeper.client.ReadLastConfirmedOp.readEntryComplete(ReadLastConfirmedOp.java:135) ~[org.apache.bookkeeper-bookkeeper-server-4.9.0.jar:4.9.0]
    at org.apache.bookkeeper.proto.PerChannelBookieClient$ReadCompletion$1.readEntryComplete(PerChannelBookieClient.java:1797) ~[org.apache.bookkeeper-bookkeeper-server-4.9.0.jar:4.9.0]
    at org.apache.bookkeeper.proto.PerChannelBookieClient$ReadCompletion.handleReadResponse(PerChannelBookieClient.java:1878) ~[org.apache.bookkeeper-bookkeeper-server-4.9.0.jar:4.9.0]
    at org.apache.bookkeeper.proto.PerChannelBookieClient$ReadCompletion.handleV2Response(PerChannelBookieClient.java:1831) ~[org.apache.bookkeeper-bookkeeper-server-4.9.0.jar:4.9.0]
    at org.apache.bookkeeper.proto.PerChannelBookieClient$ReadV2ResponseCallback.safeRun(PerChannelBookieClient.java:1321) ~[org.apache.bookkeeper-bookkeeper-server-4.9.0.jar:4.9.0]
    at org.apache.bookkeeper.common.util.SafeRunnable.run(SafeRunnable.java:36) [org.apache.bookkeeper-bookkeeper-common-4.9.0.jar:4.9.0]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_181]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_181]
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-all-4.1.32.Final.jar:4.1.32.Final]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_181]
```
----
2019-02-27 20:37:08 UTC - Grant Wu: Is this related to the patch you made for
the other issue related to Pulsar Function workers we had?
----
2019-02-27 20:44:03 UTC - Grant Wu: and now the Pulsar Function worker isn’t
initializing
----
2019-02-27 20:44:18 UTC - Grant Wu: ```
root@pulsar-broker-56d59cf97d-zwtxf:/pulsar# bin/pulsar-admin functions list
Function worker service is not done initializing. Please try again in a little
while.
Reason: HTTP 503 Service Unavailable
```
----
2019-02-27 21:22:26 UTC - Jacob O'Farrell: Thanks @Matteo Merli. Investigating
your suggestion of OpenTracing now. I agree with @Maarten Tielemans, is there
somewhere I can hit +1 for the integration?
----
2019-02-27 21:23:30 UTC - Sanjeev Kulkarni: @Grant Wu does this 503 linger for
a long time? I know this happens during the start of the worker, but barring
other errors inside the worker, it should accept requests after some time
----
2019-02-27 21:23:53 UTC - Grant Wu: It has lingered for like an hour now
----
2019-02-27 21:52:12 UTC - Matteo Merli: I don’t think there’s an issue created
yet. Please create it!
Also, contributions are welcome :slightly_smiling_face:
----
2019-02-27 21:55:01 UTC - Grant Wu: Any clue where to find function worker
logs? @Sanjeev Kulkarni
----
2019-02-27 21:55:11 UTC - Grant Wu: They’re not inside the logs directory…
----
2019-02-27 22:06:30 UTC - Sanjeev Kulkarni: there won't be any function logs
here, since there are no functions accepted/launched yet
----
2019-02-27 22:11:42 UTC - Grant Wu: I see
----
2019-02-27 22:26:19 UTC - Ryan Samo: Hey guys,
When I try to delete a bookie rack configuration it shows as deleted and then
shows right back up again like T-1000 in The Terminator... is this expected
behavior or am I supposed to do something different than
`$PULSAR_HOME/bin/pulsar-admin bookies delete-bookie-rack --bookie 127.0.0.1:3181`?
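(Roughly the sequence in question, for reference; `racks-placement` lists the current mapping:)
```
bin/pulsar-admin bookies racks-placement
bin/pulsar-admin bookies delete-bookie-rack --bookie 127.0.0.1:3181
bin/pulsar-admin bookies racks-placement   # the deleted entry reappears here
```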
----
2019-02-27 23:26:02 UTC - David Kjerrumgaard: That is the expected behavior of
the T-1000, not for the rack config
joy : Emma Pollum, Ryan Samo, Ali Ahmed
----
2019-02-28 00:51:41 UTC - Ryan Samo: Right!
----
2019-02-28 00:53:19 UTC - Ryan Samo: Haha, anyhow I feel like I might be missing
something important here. It seems straightforward enough documentation-wise,
but you can only add rack configs, never delete them. Any advice?
----
2019-02-28 01:18:31 UTC - David Kjerrumgaard: I am digging through the code
now, will keep you updated
----
2019-02-28 03:42:40 UTC - Ryan Samo: Thanks @David Kjerrumgaard !
----
2019-02-28 03:54:08 UTC - David Kjerrumgaard: @Ryan Samo Which version of
Pulsar are you running?
----
2019-02-28 03:55:50 UTC - Ryan Samo: @David Kjerrumgaard 2.2.0
----
2019-02-28 04:00:50 UTC - David Kjerrumgaard: You can try the REST API.
----
2019-02-28 04:01:54 UTC - David Kjerrumgaard: `curl -X "DELETE"
http://{pulsar-broker}/admin/v2/bookies/racks-info/{bookie}`
----
2019-02-28 04:02:25 UTC - David Kjerrumgaard:
<http://pulsar.apache.org/en/admin-rest-api/#operation/deleteBookieRackInfo>
----
2019-02-28 04:03:38 UTC - David Kjerrumgaard: If it doesn't work then we might
have a bug, so you should file an issue on the Apache site
----
2019-02-28 04:07:31 UTC - Ryan Samo: @David Kjerrumgaard Ok, I'll give it
another shot in the morning. Do I need to execute it on all nodes, or just one
and it propagates? Also, do I need to bounce anything? I tried the delete via
curl and Postman REST this evening. It shows as deleted, but only on the node
you execute the delete command on; the other nodes still show the node as
undeleted. So then I ran the delete on all nodes at once and it showed as
correct, but only for a few seconds, then the old config returned, i.e. the
T-1000 effect lol
----
2019-02-28 04:10:21 UTC - Ryan Samo: From that behavior, I might file a bug
report as you mentioned
----
2019-02-28 05:37:54 UTC - David Kjerrumgaard: yea, it sounds like it. Thank you
for trying it out.
----
2019-02-28 07:16:21 UTC - bhagesharora: What command lists the existing topics
in an Apache Pulsar cluster? I tried: bin/pulsar-admin topics,
bin/pulsar-admin topics list ??
----
2019-02-28 07:18:02 UTC - Ali Ahmed: ```pulsar-admin topics list```
----
2019-02-28 07:18:14 UTC - Ali Ahmed: with a namespace parameter should be fine
+1 : bhagesharora
----
2019-02-28 07:44:20 UTC - bossbaby: Hello all,
how do I pass a token for client authentication when using the REST API?
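(For reference, a sketch assuming token authentication is enabled on the broker; host and token are placeholders:)
```
# Pass the JWT as a Bearer token on admin REST calls
curl -s -H "Authorization: Bearer $PULSAR_TOKEN" http://<broker-host>:8080/admin/v2/tenants
```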
----
2019-02-28 08:32:01 UTC - bhagesharora: I created a function for word count
using this command:
bin/pulsar-admin functions create --jar
target/pulsar-function-0.0.1-SNAPSHOT-jar-with-dependencies.jar --className
pulsar_function.pulsar_function.WordCountFunction --tenant public --namespace
default --name word-count --inputs persistent://public/default/sentences
--output persistent://public/default/count
I wrote a WordCountFunction class in Java and built the jar, and I'm now trying
to run it in cluster mode; the command above created the function successfully.
How can I see the output, i.e. send a sentence and get the word count in the
other topic? What steps should I follow?
----
2019-02-28 08:34:18 UTC - Ali Ahmed: @bhagesharora run a simple producer of
sentences to the input topic and keep a consumer open on the output topic
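For example, with the topic names from the create command above:
```
# Produce a test sentence to the input topic...
bin/pulsar-client produce persistent://public/default/sentences -m "hello world hello"
# ...and watch the output topic (here, up to 10 messages)
bin/pulsar-client consume persistent://public/default/count -s test-sub -n 10
```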
----