2020-03-25 09:39:40 UTC - Pierre-Yves Lebecq: When I have a producer with an
Avro schema creating a topic, I can see the topic and its schema using the CLI.
When I add a function on the topic, I have the following error:
`java.lang.ClassCastException: org.apache.avro.generic.GenericData$Record
cannot be cast to
com.zenaton.engine.workflowengine.messages.DispatchTaskMessage`
Whereas if I do the exact same thing but use JSONSchema.of() instead of
AvroSchema.of() for the producer when it creates the topic, everything works
fine when I use a function on this topic.
----
2020-03-25 09:42:37 UTC - Yosi Attias: @Gilles Barbier thanks!
I found a solution to my parallelism issue. Instead of mapping a key (event) to
a list of subscriptions, which can cause issues, I will map a key like
`Event/SubscriptionId` to its subscription. This way I
won’t have any concurrency issue, and I can use the failover subscription mode
to handle the issue.
(The subscriptions I mention above are my application's subscriptions, not
pulsar subscriptions.)
The only thing missing right now is a range read, so I can read all
subscriptions for a given event.
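
A minimal sketch of the range read described above, assuming a store that keeps keys sorted. The `SubscriptionIndex` class is hypothetical and uses an in-memory `TreeMap` as a stand-in for whatever sorted key-value store is actually used. It also assumes event names never contain `/`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical in-memory stand-in for a sorted key-value store.
final class SubscriptionIndex {
    private final TreeMap<String, String> store = new TreeMap<>();

    // Key layout from the message above: "<event>/<subscriptionId>".
    static String key(String event, String subscriptionId) {
        return event + "/" + subscriptionId;
    }

    void put(String event, String subscriptionId, String subscription) {
        store.put(key(event, subscriptionId), subscription);
    }

    // Range read: every entry whose key starts with "<event>/".
    // '0' is the character right after '/' in ASCII, so "<event>0" is an
    // exclusive upper bound for all keys sharing the "<event>/" prefix.
    List<String> subscriptionsFor(String event) {
        SortedMap<String, String> range = store.subMap(event + "/", event + "0");
        return new ArrayList<>(range.values());
    }
}
```

Because each subscription has its own key, writers never contend on a shared list, and the prefix scan recovers the per-event view.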
----
2020-03-25 09:50:11 UTC - xue: Does pulsar function support the spring
framework?
----
2020-03-25 09:51:59 UTC - xue: ```ApplicationContext applicationContext = new
ClassPathXmlApplicationContext("file:G:\\dubbo-consumer.xml");```
function show error: cvc-elt.1: Cannot find the declaration of element 'beans'
----
2020-03-25 10:00:49 UTC - Dennis Yung: I found that the WebSocket client
doesn't support it right now, and such support doesn't seem to be coming soon. Is it
possible to hack it through? e.g. inject schema version info into the message
manually, and publish the schema through the REST API?
----
2020-03-25 14:54:28 UTC - David Kjerrumgaard: @xue I think Spring would be a
bit of overkill for a Pulsar Function IMHO. Longer term, the community needs to
come up with a project similar to
<https://github.com/spring-projects/spring-kafka|spring-kafka> . Pulsar
functions are intended to be lightweight, simple functions that are only a few
lines long.
----
2020-03-25 14:59:39 UTC - David Kjerrumgaard: @Pierre-Yves Lebecq If you are
using the LocalRunner, then you can configure the input schema type as shown in
this snippet.
----
2020-03-25 15:01:42 UTC - Pierre-Yves Lebecq: @David Kjerrumgaard I’m not
unfortunately. Is there a way to do the same thing when using the CLI
“functions create” command?
----
2020-03-25 15:04:44 UTC - David Kjerrumgaard: Yes, there is a
`--custom-schema-inputs` switch you can use to specify this type of
information. <https://pulsar.apache.org/docs/en/functions-cli/#create>
----
2020-03-25 15:06:25 UTC - Pierre-Yves Lebecq: I’ll have a look, thank you. :+1:
----
2020-03-25 16:26:21 UTC - Sijie Guo: @Pierre-Yves Lebecq I think there is an
issue regarding avro. We are fixing that.
----
2020-03-25 16:30:16 UTC - Pierre-Yves Lebecq: @Sijie Guo All right. thanks for
the heads up.
----
2020-03-25 17:19:42 UTC - Ashish Shinde: @Ashish Shinde has joined the channel
----
2020-03-25 17:25:08 UTC - Joel Cressy: Hey, I just found this comment thread on
HN and i’m extremely intrigued: <https://news.ycombinator.com/item?id=21937710>
If anyone knows the OP or can get in contact with them, can we share code
examples? My exact stack is AWS, EKS and hashicorp vault, so i’d love to know
how they glued together auth (piggybacked on aws iam) and various other configs.
----
2020-03-25 17:42:35 UTC - Adam Feldman: Email is in their HN profile:
<https://news.ycombinator.com/user?id=addisonj>
----
2020-03-25 17:48:56 UTC - Addison Higham: hi, that is me :slightly_smiling_face:
----
2020-03-25 17:49:07 UTC - Addison Higham: @Joel Cressy ^^
----
2020-03-25 17:50:12 UTC - Joel Cressy: Awesome! glad to meet you
----
2020-03-25 17:51:16 UTC - Joel Cressy: i’m just beginning my research into this
and am trying to gather as much info as I can
----
2020-03-25 17:51:29 UTC - Addison Higham: sure thing, let me put some details
under a thread here
----
2020-03-25 17:57:19 UTC - Addison Higham: So, the quick 5 minute version:
• we run on EKS, pretty much we started with the manifest definitions in the
pulsar repo but have changed them quite a bit at this point, but nothing too
fancy there, no real magic on an individual pulsar cluster
• We have many AWS accounts and VPCs at my company, so we expose pulsar via the
pulsar proxy to other accounts using privatelink endpoints. We create the NLB
for pulsar proxy just using a k8s service (see
<https://docs.aws.amazon.com/eks/latest/userguide/load-balancing.html>). Then
we have some terraform that simply looks up the service in k8s and then creates
the privatelink endpoint.
• We use a combination of VPC peering, external DNS, and NLBs to run a global
zookeeper on K8S AND to allow for geo replication. I can get into more detail
on this later
----
2020-03-25 18:00:59 UTC - Addison Higham: now for auth: the basic concept is
that we allow our users (via a small CLI tool and a microservice) to create an
association between an IAM role and a pulsar role.
This association is basically stored in vault. Vault allows you to create
policies that are "templated" based on the identity you are connecting
with. So for example, the following vault policy:
```pulsar/data/client/{{identity.entity.aliases.<id of your aws iam service auth>.name}}/creds/*```
----
2020-03-25 18:01:16 UTC - Addison Higham: we apply that policy to any roles we
want to trust, in any AWS account
----
2020-03-25 18:02:44 UTC - Addison Higham: basically, AWS roles that auth
against vault can read from a location that includes a unique identifier of
their role. Our little microservice periodically drops creds to that same
location (based on the association a user created)
----
2020-03-25 18:03:05 UTC - Joel Cressy: Are these creds a jwt that pulsar
validates?
----
2020-03-25 18:04:33 UTC - Addison Higham: correct, we just use token based
auth, but we use public/private key tokens so that the brokers/proxy only need
the public key and the private key can be limited to just signing tokens. We
have a k8s cron job that basically just refreshes the tokens every few hours.
Eventually, we want to write a vault plugin so that tokens are generated on
demand, but this works for now
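
To illustrate why the brokers/proxy only need the public key, here is a rough sketch of an RS256-signed JWT built with the JDK alone. The class and method names are hypothetical, the role is assumed to be a plain string that needs no JSON escaping, and the role is placed in the `sub` claim on the assumption that this is where token auth reads it. A real setup would use Pulsar's own token tooling or a JWT library, and would also check `exp` on verification, which this sketch does not.

```java
import java.nio.charset.StandardCharsets;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;
import java.util.Base64;

// Hypothetical helper, not part of Pulsar.
final class AsymmetricTokenSketch {
    private static final Base64.Encoder B64 = Base64.getUrlEncoder().withoutPadding();

    static String b64(String s) {
        return B64.encodeToString(s.getBytes(StandardCharsets.UTF_8));
    }

    // Signing side (e.g. the cron job): the only place that needs the private key.
    static String sign(String role, long expiresAtEpochSeconds, PrivateKey privateKey) throws Exception {
        String header = b64("{\"alg\":\"RS256\",\"typ\":\"JWT\"}");
        String payload = b64("{\"sub\":\"" + role + "\",\"exp\":" + expiresAtEpochSeconds + "}");
        String signingInput = header + "." + payload;
        Signature signer = Signature.getInstance("SHA256withRSA");
        signer.initSign(privateKey);
        signer.update(signingInput.getBytes(StandardCharsets.US_ASCII));
        return signingInput + "." + B64.encodeToString(signer.sign());
    }

    // Verifying side (e.g. broker or proxy): the public key is enough.
    // Only the signature is checked here; expiry is not.
    static boolean verify(String token, PublicKey publicKey) throws Exception {
        String[] parts = token.split("\\.");
        if (parts.length != 3) {
            return false;
        }
        Signature verifier = Signature.getInstance("SHA256withRSA");
        verifier.initVerify(publicKey);
        verifier.update((parts[0] + "." + parts[1]).getBytes(StandardCharsets.US_ASCII));
        return verifier.verify(Base64.getUrlDecoder().decode(parts[2]));
    }

    static KeyPair newKeyPair() throws Exception {
        KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
        generator.initialize(2048);
        return generator.generateKeyPair();
    }
}
```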
----
2020-03-25 18:05:47 UTC - Joel Cressy: Yeah, kinda sucks that vault doesn’t
have a jwt secrets engine. You can auth to vault with jwt, but vault won’t
generate/sign jwt’s for you.
----
2020-03-25 18:06:15 UTC - Addison Higham: we have a WIP plugin that we hope to
open source when we get there, just sorta stalled out with lots of other
priorities
----
2020-03-25 18:07:31 UTC - Addison Higham: but anyways, does that make sense? We
have some tooling in our little CLI tool that can do the vault auth and fetch
the token for you. That has made it easier for even teams that aren't totally
up on vault yet to just get started by calling our CLI tool to fetch the token
----
2020-03-25 18:07:58 UTC - Joel Cressy: What kind of TTL is on these credentials?
----
2020-03-25 18:08:55 UTC - Addison Higham: oh and one important detail: the ID
of the AWS role (or any AWS principal really) that vault uses for templating
locations is the `UniqueID` which means that any tooling first needs to resolve
role/user/etc to the unique ID before going to fetch the credential
----
2020-03-25 18:09:19 UTC - Addison Higham: (see this doc for details on that:
<https://docs.aws.amazon.com/IAM/latest/UserGuide/reference_identifiers.html#identifiers-unique-ids>)
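
As a rough illustration of the templated path, the sketch below shows how a unique ID scopes what a role can read. The class name, the example ID `AROAEXAMPLEROLEID`, and the credential name `token` are made up for illustration; the path layout follows the policy shown earlier, and the trailing `*` is treated as a simple prefix match.

```java
// Hypothetical helper that mirrors the templated policy path
// pulsar/data/client/<unique id>/creds/*
final class VaultPathSketch {
    static String credsPrefix(String awsUniqueId) {
        return "pulsar/data/client/" + awsUniqueId + "/creds/";
    }

    // Where the microservice would drop a credential for this principal.
    static String credsPath(String awsUniqueId, String credentialName) {
        return credsPrefix(awsUniqueId) + credentialName;
    }

    // The trailing "*" in the policy acts as a prefix match, so a principal
    // can read only paths under its own unique ID.
    static boolean isReadable(String awsUniqueId, String path) {
        return path.startsWith(credsPrefix(awsUniqueId));
    }
}
```

Tooling therefore has to resolve the role or user to its unique ID first, then build the path from that ID.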
----
2020-03-25 18:09:58 UTC - Joel Cressy: Oh, that’s like access key id for
standard iam creds. When you assume role, you have one of these unique ID’s as
the access key id along with secret and session/security token.
----
2020-03-25 18:10:20 UTC - Addison Higham: we let the user decide the TTL within
a range, but right now, defaults to a year. We still have a lot of apps that
aren't native k8s or with an existing vault integration yet. As that evolves,
we want to drop the default TTL down to a day or something
----
2020-03-25 18:10:48 UTC - Joel Cressy: Awesome, yeah user configurable TTL is
great because I also have a lot of things that don’t have native integrations.
----
2020-03-25 18:11:46 UTC - Joel Cressy: BTW, one of the first use cases I intend
to run pulsar for is log ingest/forwarding. I will do traditional pub/sub later
but current goals are to provide interfaces for log forwarders to send data
(e.g. filebeat, fluentd/fluentbit, etc)
----
2020-03-25 18:11:49 UTC - Addison Higham: yeah, obviously not ideal, but since
this is all over internal VPC (privatelink) and we have a plan for apps to be
able to use short lived tokens, that is where we are for now
----
2020-03-25 18:12:10 UTC - Addison Higham: heh, yeah, so I actually just wrote a
fluentbit plugin
----
2020-03-25 18:12:36 UTC - Addison Higham: via the golang interface, haven't
rolled it out at scale yet (next week likely), it isn't open source yet though
----
2020-03-25 18:12:53 UTC - Joel Cressy: I may also need to find a way to map
non-aws service identities to pulsar, but with vault the sky’s the limit.
----
2020-03-25 18:14:21 UTC - Addison Higham: yeah, so our little microservice we
have, it can generate creds for users after an okta login. But yes, the same
pattern with vault should work relatively sanely
----
2020-03-25 18:15:19 UTC - Joel Cressy: ooo, yeah we use okta too so i’d have to
do something similar. Users auth to vault with okta, so maybe something could
be done there.
----
2020-03-25 18:16:58 UTC - Arthur: @Arthur has joined the channel
----
2020-03-25 18:21:51 UTC - Arthur: Hey, I'm trying pulsar on Kubernetes with the official
helm chart with only 3 kubernetes nodes. On the first try, I got an affinity/antiaffinity
error. I changed replication to 1 for zookeeper, bookie and broker but now I
have "ManagedLedgerException: Not enough non-faulty bookies available". Is it
possible to disable antiaffinity ?
<#CJ0FMGHSM|kubernetes>
----
2020-03-25 21:01:15 UTC - Sijie Guo: you need to change the broker config map
to reduce ensemble size to 1.
<https://github.com/apache/pulsar/blob/master/deployment/kubernetes/helm/pulsar/values.yaml#L231>
I don’t think the current helm chart in master supports disabling antiaffinity.
If you are looking for a chart doing so, you can try our chart.
<https://github.com/streamnative/charts>
use this values file
(<https://github.com/streamnative/charts/blob/master/examples/pulsar/values-minikube.yaml>)
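
A small sketch of why a single bookie fails until the ensemble size is reduced. The `QuorumCheck` class is hypothetical; the three numbers correspond to the broker settings `managedLedgerDefaultEnsembleSize`, `managedLedgerDefaultWriteQuorum` and `managedLedgerDefaultAckQuorum`, and the checks encode the usual constraint that ensemble >= write quorum >= ack quorum, with at least as many bookies as the ensemble size.

```java
import java.util.Optional;

// Hypothetical sanity check for ledger replication settings.
final class QuorumCheck {
    // Returns a description of the first problem found, or empty if the
    // settings can be satisfied with the given number of bookies.
    static Optional<String> problem(int ensemble, int writeQuorum, int ackQuorum, int availableBookies) {
        if (ackQuorum > writeQuorum) {
            return Optional.of("ack quorum exceeds write quorum");
        }
        if (writeQuorum > ensemble) {
            return Optional.of("write quorum exceeds ensemble size");
        }
        if (ensemble > availableBookies) {
            return Optional.of("not enough bookies for the ensemble size");
        }
        return Optional.empty();
    }
}
```

With one bookie, an ensemble size of 2 cannot be satisfied, while 1/1/1 can.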
----
2020-03-26 00:24:34 UTC - Hiroyuki Yamada: Hi, I posted a question regarding
Key_Shared behavior on the mailing list. Maybe my understanding and expectation
are not correct, but it doesn't work as expected so far.
<http://mail-archives.apache.org/mod_mbox/pulsar-users/202003.mbox/browser>
Can anyone help me ?
(BTW, I've tested it with 2.5.0)
----
2020-03-26 00:39:03 UTC - Sijie Guo: Just replied.
----
2020-03-26 01:27:25 UTC - Kannan: Hi, Running Pulsar 2.5.0 with istio 1.5 on
AKS(k8s cluster). Is it possible to replace pulsar proxy with istio ingress ?
----
2020-03-26 02:00:29 UTC - Hiroyuki Yamada: Thank you !
I just replied back.
Hmm, seems like it is not working as expected. So, messages with the same key
go to different consumers even if there is no new consumer joining the
subscription.
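
The expectation here can be sketched as follows: with a fixed set of consumers, a key always hashes to the same slot and therefore to the same consumer. This is a simplified illustration only, not Pulsar's actual Key_Shared implementation; the class name, the slot count, and the use of `String.hashCode` are all assumptions made for the example.

```java
import java.util.List;

// Simplified illustration of sticky key-to-consumer routing.
final class KeySharedSketch {
    static final int SLOTS = 65536; // illustrative hash range

    // Deterministic slot for a key, always in [0, SLOTS).
    static int slot(String key) {
        return Math.floorMod(key.hashCode(), SLOTS);
    }

    // Each consumer owns one contiguous slice of the slot range. As long as
    // the consumer list does not change, a key maps to the same consumer.
    static String consumerFor(String key, List<String> consumers) {
        int width = SLOTS / consumers.size();
        int index = Math.min(slot(key) / width, consumers.size() - 1);
        return consumers.get(index);
    }
}
```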
----
2020-03-26 02:15:03 UTC - Andy Papia: we are using Keycloak
(<http://keycloak.org|keycloak.org>) for authn/authz in our system. has anyone
thought about creating an authentication plugin for keycloak or something
standards-based like OAuth 2?
----
2020-03-26 02:18:02 UTC - Ali Ahmed: @Andy Papia no, not yet, but it should be
simple to do. Here is a sample external auth plugin for pulsar:
----
2020-03-26 02:18:03 UTC - Ali Ahmed:
<https://github.com/CleverCloud/biscuit-pulsar>
----
2020-03-26 02:19:23 UTC - Andy Papia: thanks, that's helpful
----
2020-03-26 03:22:41 UTC - Kevin Hui: @Kevin Hui has joined the channel
----
2020-03-26 03:43:56 UTC - Sijie Guo: Interesting, @Penghui Li can you help
check this?
----
2020-03-26 03:55:30 UTC - Evan Xu: @Evan Xu has joined the channel
----
2020-03-26 06:25:25 UTC - Amit Aggarwal: @Amit Aggarwal has joined the channel
----
2020-03-26 08:21:32 UTC - Amit Aggarwal: are there any known issues with
ttlDurationDefaultInSeconds config setting in pulsar 2.5.0 ?
After setting it to 604800 (7 days), the messages were being expired
immediately (5 min after producing)
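
For reference, a quick check of the arithmetic and of what the setting is expected to do. The `TtlCheck` class is a hypothetical sketch of the expected expiry rule, not broker code: a message counts as expired once its age reaches the TTL.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical sketch of the expected TTL rule.
final class TtlCheck {
    // 7 days expressed in seconds, i.e. the value used for the setting above.
    static final long SEVEN_DAYS_SECONDS = Duration.ofDays(7).getSeconds();

    // A message is expired once its age is at least the TTL.
    static boolean isExpired(Instant publishedAt, Instant now, long ttlSeconds) {
        return Duration.between(publishedAt, now).getSeconds() >= ttlSeconds;
    }
}
```

Under this rule a message that is 5 minutes old should not be expired with a TTL of 604800 seconds.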
----
2020-03-26 08:52:10 UTC - Sijie Guo: Can you describe the sequence and how
you verified that?
----