2019-02-27 09:20:22 UTC - Maarten Tielemans: @Matteo Merli is there a feature
ticket for OpenTracing? just so I can +1 it
----
2019-02-27 09:28:40 UTC - Marc Le Labourier: I have done nothing to change it.
Where and how should it be defined?
----
2019-02-27 10:04:02 UTC - Sébastien de Melo: We left the Grafana Docker image untouched, so it is:
"__inputs": [
  {
    "name": "DS_default",
    "label": "default",
    "description": "",
    "type": "datasource",
    "pluginId": "prometheus",
    "pluginName": "Prometheus"
  }
]
----
2019-02-27 10:07:58 UTC - Sébastien de Melo: We have added the following at the
beginning of monitoring.yaml:
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: "prometheus"
  labels:
    app: pulsar
    cluster: pulsar
rules:
- apiGroups: [""]
  resources:
  - nodes
  - nodes/proxy
  - services
  - endpoints
  - pods
  verbs: ["get", "list", "watch"]
- nonResourceURLs: ["/metrics"]
  verbs: ["get"]
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: "prometheus"
  namespace: pulsar
  labels:
    app: pulsar
    cluster: pulsar
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: "prometheus"
  labels:
    app: pulsar
    cluster: pulsar
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: "prometheus"
subjects:
- kind: ServiceAccount
  name: "prometheus"
  namespace: pulsar
You have to adapt it to your needs
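Then re-apply it, e.g. (a sketch, using the deployment/kubernetes/aws path we use):
```
kubectl apply -f deployment/kubernetes/aws/monitoring.yaml
```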
----
2019-02-27 10:09:07 UTC - Sébastien de Melo: And we had to add serviceAccount:
prometheus in the spec section of the prometheus Deployment:
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: prometheus
  labels:
    app: pulsar
    cluster: pulsar
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: pulsar
        component: prometheus
        cluster: pulsar
    spec:
      containers:
      - name: prometheus
        image: prom/prometheus:v1.6.3
        volumeMounts:
        - name: config-volume
          mountPath: /etc/prometheus
        - name: data-volume
          mountPath: /prometheus
        ports:
        - containerPort: 9090
      volumes:
      - name: config-volume
        configMap:
          name: prometheus-config
      - name: data-volume
        persistentVolumeClaim:
          claimName: prometheus-data-volume
      nodeSelector:
        onlyfor: pulsar
      serviceAccount: prometheus
----
2019-02-27 10:59:25 UTC - Marc Le Labourier: Anyone working with Kubernetes?
We seem to have problems with Prometheus and the broker metrics: some of them
are duplicated when a new topic is created.
Here you can see that pulsar_subscriptions_count is defined multiple times.
----
2019-02-27 11:13:51 UTC - Sijie Guo: the metrics have different labels, no?
----
2019-02-27 13:31:22 UTC - Sébastien de Melo: The reported error is: text format
parsing error in line 189: second TYPE line for metric name
“pulsar_subscriptions_count”, or TYPE reported after samples
----
2019-02-27 13:52:10 UTC - Maarten Tielemans: There are two definitions of `# TYPE
pulsar_subscriptions_count gauge` (lines 167 and 189).
Line 167 seems to be a subscription count across all topics? Line 189 seems to be per topic?
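(A quick way to see the duplicate TYPE lines; the broker address below is a placeholder, 8080 being the usual admin/metrics port:)
```
curl -s http://<broker-host>:8080/metrics | grep -n 'TYPE pulsar_subscriptions_count'
```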
----
2019-02-27 14:04:41 UTC - Marc Le Labourier: Yes, that's the problem. Prometheus
does not seem to like multiple definitions (even if the labels are different).
However, I don't know the origin of this behaviour or how to fix the broker
metrics.
----
2019-02-27 14:19:48 UTC - Marc Le Labourier: Seems to be related to this issue:
<https://github.com/apache/pulsar/issues/3112>
----
2019-02-27 14:32:57 UTC - Sijie Guo: i see. a temporary workaround is to disable
topic-level metrics: `exposeTopicLevelMetricsInPrometheus=false`
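A sketch of the change, assuming the stock broker.conf layout (or the broker ConfigMap in the k8s templates); the brokers need a restart afterwards:
```
# Sketch only: flip the flag in conf/broker.conf, then restart the brokers
sed -i 's/^exposeTopicLevelMetricsInPrometheus=.*/exposeTopicLevelMetricsInPrometheus=false/' conf/broker.conf
grep exposeTopicLevelMetricsInPrometheus conf/broker.conf
```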
----
2019-02-27 14:36:34 UTC - Sébastien de Melo: Ok thanks, testing it
----
2019-02-27 15:05:02 UTC - Matteo Merli: Which version of Prometheus is this
running with?
----
2019-02-27 15:08:57 UTC - Sébastien de Melo: v1.6.3
----
2019-02-27 15:09:39 UTC - Sébastien de Melo: we are using monitoring.yaml in
deployment/kubernetes/aws
----
2019-02-27 15:09:41 UTC - Matteo Merli: Ok, we are running with 2.4.x without
any problems
----
2019-02-27 15:10:05 UTC - Matteo Merli: These YAMLs should be updated at this
point
----
2019-02-27 15:28:09 UTC - David Kjerrumgaard: No worries. It looks like you
found the issue and potential fix
----
2019-02-27 16:11:53 UTC - Chris DiGiovanni: I was wondering if someone can
comment on this scenario.
I have deployed a Pulsar cluster into a custom k8s cluster using the provided
templates, with the modification of using StatefulSets for the bookies. I've
deployed 3 bookies, each with 500G ledger disks. In this scenario I start
running out of space. What is the procedure?
----
2019-02-27 16:14:53 UTC - Chris DiGiovanni: Reading the Apache BookKeeper docs,
I don't see how you can rebalance the data onto additional bookies.
----
2019-02-27 16:17:10 UTC - Matteo Merli: You can just add more nodes to augment
the overall capacity of the cluster
----
2019-02-27 16:17:50 UTC - Chris DiGiovanni: nodes = bookies?
----
2019-02-27 16:17:57 UTC - Matteo Merli: There’s no need to explicitly rebalance
the data
----
2019-02-27 16:17:59 UTC - Matteo Merli: Yes
----
2019-02-27 16:18:33 UTC - Chris DiGiovanni: I see, so Pulsar will know if a
bookie's disk is running low and there is space available elsewhere?
----
2019-02-27 16:19:18 UTC - Matteo Merli: Bookies will automatically go into
read-only mode when disk is (almost) full. When data ages out, the disk usage
will equalize across all bookies
----
2019-02-27 16:20:05 UTC - Chris DiGiovanni: @Matteo Merli Thanks for your reply
----
2019-02-27 17:14:27 UTC - Sébastien de Melo: Thank you, Prometheus 2.7.1 fixed
the issue. However, it didn't solve the other one (the topics dashboard):
Templating init failed
Datasource named ${DS_default} was not found
Do you have any clue?
----
2019-02-27 17:15:54 UTC - Matteo Merli: Not sure about that. Seems to be
related to grafana not finding the default data source
----
2019-02-27 17:16:14 UTC - Matteo Merli: You can check the data sources in the
Grafana settings
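(or via the Grafana HTTP API; host and credentials below are placeholders:)
```
# Lists the configured data sources
curl -s -u admin:admin http://<grafana-host>:3000/api/datasources
```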
----
2019-02-27 17:17:39 UTC - Sébastien de Melo: We checked. Prometheus is the
default (and only) data source
----
2019-02-27 17:37:57 UTC - Ryan Samo: Hey guys, is bookie auto-recovery truly
automatic, in that if I take nodes down it will attempt to recover the data to
the available nodes? Is that a separate thread running? We have 6 bookies, just
trying to make sure we are highly available with the ensemble, write quorum,
and ack quorum.
Thanks!
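(For reference, the ensemble / write / ack quorums are namespace persistence settings; a sketch with illustrative values only:)
```
# Illustrative values; tune to your durability/availability needs
bin/pulsar-admin namespaces set-persistence \
  --bookkeeper-ensemble 3 --bookkeeper-write-quorum 3 --bookkeeper-ack-quorum 2 \
  public/default
```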
----
2019-02-27 17:38:57 UTC - Matteo Merli: That is correct
----
2019-02-27 17:38:59 UTC - Ryan Samo: Also trying to figure out rack awareness
around that
----
2019-02-27 17:39:15 UTC - Matteo Merli: The auto recovery process is turned on
by default
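(A sketch of where to check, assuming the stock bookkeeper.conf layout:)
```
# Auto-recovery running inside each bookie:
grep autoRecoveryDaemonEnabled conf/bookkeeper.conf
# ...or run as a standalone daemon:
bin/bookkeeper autorecovery
```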
----
2019-02-27 17:40:48 UTC - Ryan Samo: So seeing errors like
BKLedgerRecoveryException: Error while recovering ledger
is normal?
----
2019-02-27 17:45:03 UTC - Ryan Samo: Tailing the bookie logs will show that
error over and over until we bring them back up. I just wondered if it would
auto recover and clear up
----
2019-02-27 18:05:06 UTC - Sébastien de Melo: What are we supposed to see in the
topics dashboard please? Query '{topic=~".+"}' in Prometheus returns no results
----
2019-02-27 18:07:40 UTC - Marc Le Labourier: Topic stats seem to be collected
from `topics_stats = get(broker_url, '/admin/broker-stats/destinations')`.
So I guess they are not available through Prometheus, as it only uses the
/metrics URL?
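(A quick way to compare the two endpoints; the broker address is a placeholder:)
```
# Detailed per-topic stats come from the admin API...
curl -s http://<broker-host>:8080/admin/broker-stats/destinations | head
# ...while Prometheus only scrapes the exposition endpoint:
curl -s http://<broker-host>:8080/metrics | grep '^pulsar_' | head
```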
----
2019-02-27 20:36:50 UTC - Grant Wu: @Matteo Merli
```
19:41:53.512 [BookKeeperClientWorker-OrderedExecutor-1-0] ERROR org.apache.bookkeeper.common.util.SafeRunnable - Unexpected throwable caught
java.lang.ArrayIndexOutOfBoundsException: -2
    at com.google.common.collect.RegularImmutableList.get(RegularImmutableList.java:60) ~[com.google.guava-guava-21.0.jar:?]
    at org.apache.bookkeeper.client.PendingReadOp$SequenceReadRequest.sendNextRead(PendingReadOp.java:400) ~[org.apache.bookkeeper-bookkeeper-server-4.9.0.jar:4.9.0]
    at org.apache.bookkeeper.client.PendingReadOp$SequenceReadRequest.read(PendingReadOp.java:382) ~[org.apache.bookkeeper-bookkeeper-server-4.9.0.jar:4.9.0]
    at org.apache.bookkeeper.client.PendingReadOp.initiate(PendingReadOp.java:526) ~[org.apache.bookkeeper-bookkeeper-server-4.9.0.jar:4.9.0]
    at org.apache.bookkeeper.client.LedgerRecoveryOp.doRecoveryRead(LedgerRecoveryOp.java:148) ~[org.apache.bookkeeper-bookkeeper-server-4.9.0.jar:4.9.0]
    at org.apache.bookkeeper.client.LedgerRecoveryOp.access$000(LedgerRecoveryOp.java:37) ~[org.apache.bookkeeper-bookkeeper-server-4.9.0.jar:4.9.0]
    at org.apache.bookkeeper.client.LedgerRecoveryOp$1.readLastConfirmedDataComplete(LedgerRecoveryOp.java:109) ~[org.apache.bookkeeper-bookkeeper-server-4.9.0.jar:4.9.0]
    at org.apache.bookkeeper.client.ReadLastConfirmedOp.readEntryComplete(ReadLastConfirmedOp.java:135) ~[org.apache.bookkeeper-bookkeeper-server-4.9.0.jar:4.9.0]
    at org.apache.bookkeeper.proto.PerChannelBookieClient$ReadCompletion$1.readEntryComplete(PerChannelBookieClient.java:1797) ~[org.apache.bookkeeper-bookkeeper-server-4.9.0.jar:4.9.0]
    at org.apache.bookkeeper.proto.PerChannelBookieClient$ReadCompletion.handleReadResponse(PerChannelBookieClient.java:1878) ~[org.apache.bookkeeper-bookkeeper-server-4.9.0.jar:4.9.0]
    at org.apache.bookkeeper.proto.PerChannelBookieClient$ReadCompletion.handleV2Response(PerChannelBookieClient.java:1831) ~[org.apache.bookkeeper-bookkeeper-server-4.9.0.jar:4.9.0]
    at org.apache.bookkeeper.proto.PerChannelBookieClient$ReadV2ResponseCallback.safeRun(PerChannelBookieClient.java:1321) ~[org.apache.bookkeeper-bookkeeper-server-4.9.0.jar:4.9.0]
    at org.apache.bookkeeper.common.util.SafeRunnable.run(SafeRunnable.java:36) [org.apache.bookkeeper-bookkeeper-common-4.9.0.jar:4.9.0]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_181]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_181]
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-all-4.1.32.Final.jar:4.1.32.Final]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_181]
```
----
2019-02-27 20:37:08 UTC - Grant Wu: Is this related to the patch you made for
the other issue related to Pulsar Function workers we had?
----
2019-02-27 20:44:03 UTC - Grant Wu: and now the Pulsar Function worker isn’t
initializing
----
2019-02-27 20:44:18 UTC - Grant Wu: ```
root@pulsar-broker-56d59cf97d-zwtxf:/pulsar# bin/pulsar-admin functions list
Function worker service is not done initializing. Please try again in a little
while.
Reason: HTTP 503 Service Unavailable
```
----
2019-02-27 21:22:26 UTC - Jacob O'Farrell: Thanks @Matteo Merli. Investigating
your suggestion of OpenTracing now. I agree with @Maarten Tielemans, is there
somewhere I can hit +1 for the integration?
----
2019-02-27 21:23:30 UTC - Sanjeev Kulkarni: @Grant Wu does this 503 linger for
a long time? I know this happens during the start of the worker, but barring
other errors inside the worker, it should accept requests after some time
----
2019-02-27 21:23:53 UTC - Grant Wu: It has lingered for like an hour now
----
2019-02-27 21:52:12 UTC - Matteo Merli: I don’t think there’s an issue created
yet. Please create it!
Also, contributions are welcome :slightly_smiling_face:
----
2019-02-27 21:55:01 UTC - Grant Wu: Any clue where to find function worker
logs? @Sanjeev Kulkarni
----
2019-02-27 21:55:11 UTC - Grant Wu: They’re not inside the logs directory…
----
2019-02-27 22:06:30 UTC - Sanjeev Kulkarni: there won't be any function logs
here, since there are no functions accepted/launched yet
----
2019-02-27 22:11:42 UTC - Grant Wu: I see
----
2019-02-27 22:26:19 UTC - Ryan Samo: Hey guys,
When I try to delete a bookie rack configuration it shows as deleted and then
shows right back up again like T-1000 in The Terminator... is this expected
behavior or am I supposed to do something different than
`$PULSAR_HOME/bin/pulsar-admin bookies delete-bookie-rack --bookie 127.0.0.1:3181`?
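(Roughly the sequence in question, for reference; `racks-placement` lists the current mapping:)
```
bin/pulsar-admin bookies racks-placement
bin/pulsar-admin bookies delete-bookie-rack --bookie 127.0.0.1:3181
bin/pulsar-admin bookies racks-placement   # the deleted entry reappears here
```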
----
2019-02-27 23:26:02 UTC - David Kjerrumgaard: That is the expected behavior of
the T-1000, not for the rack config
joy : Emma Pollum, Ryan Samo, Ali Ahmed
----
2019-02-28 00:51:41 UTC - Ryan Samo: Right!
----
2019-02-28 00:53:19 UTC - Ryan Samo: Haha, anyhow I feel like I might be missing
something important here. It seems straightforward enough documentation-wise,
but you can only add rack configs, never delete them. Any advice?
----
2019-02-28 01:18:31 UTC - David Kjerrumgaard: I am digging through the code
now, will keep you updated
----
2019-02-28 03:42:40 UTC - Ryan Samo: Thanks @David Kjerrumgaard !
----
2019-02-28 03:54:08 UTC - David Kjerrumgaard: @Ryan Samo Which version of
Pulsar are you running?
----
2019-02-28 03:55:50 UTC - Ryan Samo: @David Kjerrumgaard 2.2.0
----
2019-02-28 04:00:50 UTC - David Kjerrumgaard: You can try the REST API.
----
2019-02-28 04:01:54 UTC - David Kjerrumgaard: `curl -X "DELETE"
http://{pulsar-broker}/admin/v2/bookies/racks-info/{bookie}`
----
2019-02-28 04:02:25 UTC - David Kjerrumgaard:
<http://pulsar.apache.org/en/admin-rest-api/#operation/deleteBookieRackInfo>
----
2019-02-28 04:03:38 UTC - David Kjerrumgaard: If it doesn't work then we might
have a bug, so you should file an issue on the Apache site
----
2019-02-28 04:07:31 UTC - Ryan Samo: @David Kjerrumgaard Ok, I'll give it
another shot in the morning. Do I need to execute it on all nodes, or just one
and it propagates? Also, do I need to bounce anything? I tried the delete via
curl and Postman REST this evening. It shows as deleted, but only on the node
you execute the delete command on; the other nodes still show the node as
undeleted. So then I ran the delete on all nodes at once and it showed as
correct, but only for a few seconds, then the old config returned, i.e. the
T-1000 effect lol
----
2019-02-28 04:10:21 UTC - Ryan Samo: From that behavior, I might file a bug
report as you mentioned
----
2019-02-28 05:37:54 UTC - David Kjerrumgaard: yea, it sounds like it. Thank you
for trying it out.
----
2019-02-28 07:16:21 UTC - bhagesharora: What command lists the existing topics
in an Apache Pulsar cluster? I tried: bin/pulsar-admin topics,
bin/pulsar-admin topics list ??
----
2019-02-28 07:18:02 UTC - Ali Ahmed: ```pulsar-admin topics list```
----
2019-02-28 07:18:14 UTC - Ali Ahmed: with a namespace parameter should be fine
+1 : bhagesharora
----
2019-02-28 07:44:20 UTC - bossbaby: Hello all,
how do I pass a token for client authentication when using the REST API?
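(For reference, a sketch assuming token authentication is enabled on the broker; host and token are placeholders:)
```
# Pass the JWT as a Bearer token on admin REST calls
curl -s -H "Authorization: Bearer $PULSAR_TOKEN" http://<broker-host>:8080/admin/v2/tenants
```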
----
2019-02-28 08:32:01 UTC - bhagesharora: I created a function for word count
using this command:
bin/pulsar-admin functions create --jar
target/pulsar-function-0.0.1-SNAPSHOT-jar-with-dependencies.jar --className
pulsar_function.pulsar_function.WordCountFunction --tenant public --namespace
default --name word-count --inputs persistent://public/default/sentences
--output persistent://public/default/count
I wrote a WordCountFunction class in Java and built the jar, and I'm now trying
to run it in cluster mode; the command above created the function successfully.
How can I see the output, i.e. send a sentence and get the word count in the
other topic? What steps should I follow?
----
2019-02-28 08:34:18 UTC - Ali Ahmed: @bhagesharora run a simple producer of
sentences to the input topic and keep a consumer open on the output topic
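For example, with the topic names from the create command above:
```
# Produce a test sentence to the input topic...
bin/pulsar-client produce persistent://public/default/sentences -m "hello world hello"
# ...and watch the output topic (here, up to 10 messages)
bin/pulsar-client consume persistent://public/default/count -s test-sub -n 10
```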
----