2019-04-09 09:38:00 UTC - Laurent Chriqui: Hello. Any updates on that matter 
@Sijie Guo? Did we get it right? Would there be easier means to implement this ?
----
2019-04-09 12:32:15 UTC - Alexandre DUVAL: Hi, I see only getters in the Record 
and Context classes for Pulsar Functions. Is there a way to add a property to a 
message during function processing?
----
2019-04-09 12:34:37 UTC - John Crawford: just fyi, we ended up just downgrading 
to 2.2.1
----
2019-04-09 12:37:23 UTC - Sijie Guo: you mean adding properties to the result 
messages?
----
2019-04-09 12:37:46 UTC - Alexandre DUVAL: Yes
----
2019-04-09 12:39:10 UTC - Sijie Guo: currently this isn’t supported, but it 
should be straightforward to add this feature. Do you mind creating an issue 
in Pulsar so the community can pick it up?
----
2019-04-09 12:43:52 UTC - Alexandre DUVAL: 
<https://github.com/apache/pulsar/issues/4009>
----
2019-04-09 13:10:11 UTC - young: with the pulsar-client tools, if I run the 
producer first and then run the consumer, the first message is lost, but the 
second message is not.
----
2019-04-09 13:14:18 UTC - Sijie Guo: by default, a brand new subscription 
starts from the latest messages. So if you want to consume all the messages in a 
topic, you have two options:

a) create the subscription before messages are published.

b) create the consumer by specifying 
`.subscriptionInitialPosition(SubscriptionInitialPosition.Earliest)`.
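
To illustrate the two options, here is a toy model in plain Java of where a new subscription's cursor starts. It is only a sketch and does not use the Pulsar client: `TopicModel`, `InitialPosition`, `subscribe`, and `readFrom` are hypothetical names invented for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: NOT the Pulsar client API. It mimics how a new
// subscription's cursor is placed relative to messages already in the topic.
final class TopicModel {
    enum InitialPosition { LATEST, EARLIEST }

    private final List<String> log = new ArrayList<>();

    void publish(String message) {
        log.add(message);
    }

    // Returns the index of the first message a brand-new subscription will see.
    int subscribe(InitialPosition position) {
        return position == InitialPosition.EARLIEST ? 0 : log.size();
    }

    // Returns every message at or after the given cursor index.
    List<String> readFrom(int cursor) {
        return new ArrayList<>(log.subList(cursor, log.size()));
    }

    public static void main(String[] args) {
        TopicModel topic = new TopicModel();
        topic.publish("first");                       // published before any subscription exists
        int latest = topic.subscribe(InitialPosition.LATEST);
        int earliest = topic.subscribe(InitialPosition.EARLIEST);
        topic.publish("second");                      // published after subscribing
        System.out.println(topic.readFrom(latest));   // prints [second]
        System.out.println(topic.readFrom(earliest)); // prints [first, second]
    }
}
```

In this model the "latest" cursor skips the message published before the subscription existed, which matches the lost first message described above.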
----
2019-04-09 14:42:01 UTC - Chris DiGiovanni: Does the 
managedLedgerOffloadMaxThreads apply to reads and writes?
----
2019-04-09 15:54:21 UTC - Emma Pollum: Having trouble with the prometheus 
metrics, continually getting this
error while linting: text format parsing error in line 189: second TYPE line 
for metric name "pulsar_subscriptions_count", or TYPE reported after samples
----
2019-04-09 15:54:26 UTC - Emma Pollum: is there a setting I have to change?
----
2019-04-09 15:59:27 UTC - Sijie Guo: what version of prometheus are you using? 
If you are using a 1.x version, it might have problems handling 
duplicated metrics. Try upgrading to version 2.4.x or above.
----
2019-04-09 16:00:01 UTC - Sijie Guo: or a temp workaround on the broker side: 
set `exposeTopicLevelMetricsInPrometheus=false`
----
2019-04-09 16:17:38 UTC - Emma Pollum: :thumbsup:  I will try to upgrade.
I did turn off topic level metrics and I'm still getting the error on 
pulsar_topics_count
looks like there is one for the cluster, and one for namespace level
----
2019-04-09 16:22:42 UTC - Emma Pollum: I'm running promtool v2.7.1 and 
still getting the issue.
----
2019-04-09 17:44:19 UTC - Emma Pollum: @Sijie Guo after updating prometheus to 
2.7.1, prometheus cannot scrape if I have topic level metrics on. I need those 
metrics though to see subscription backlog counts... do you know of a 
workaround or a fix that is coming soon?
----
2019-04-09 19:35:09 UTC - Devin G. Bost: @Grant Wu Do you have links to any of 
the documentation for the k8s tooling that you're referring to (that overlaps 
with my proposal for the manifest approach)?
----
2019-04-09 19:35:43 UTC - Grant Wu: So the tool that we use is 
<https://helm.sh/docs/helm/>
----
2019-04-09 19:35:56 UTC - Devin G. Bost: Thanks!
----
2019-04-09 19:37:55 UTC - Devin G. Bost: Is there a YAML file somewhere for 
deploying Pulsar components via Helm?
----
2019-04-09 19:38:08 UTC - Kenan Dalley: Is there an example of a kafka source 
ingestion?  I've been working to consume from an existing SSL-based Kafka 
cluster, and am running into issues.  It looks like I may have finally got the 
source to connect to my cluster correctly, but no pulsar function was created.  
I see no errors in either the pulsar log or the function log, and actually see 
the,  but when I do a pulsar-admin functions list, nothing shows up.  
pulsar-admin source list shows my source.
----
2019-04-09 19:38:53 UTC - Grant Wu: 
<https://github.com/apache/pulsar/tree/master/deployment/kubernetes/helm> 
Pulsar supplies helm charts for Pulsar here
----
2019-04-09 19:39:10 UTC - Devin G. Bost: I see YAML files for the broker, 
bookkeeper, grafana, prometheus, and related services, but I don't see anything 
for sinks or sources (or functions).
----
2019-04-09 19:39:40 UTC - Grant Wu: I don’t know anything about sinks or 
sources.  Functions being missing is what I was referring to when I said 
“Pulsar Functions are the exception here because they’re not materialized as 
k8s resources anywhere”
----
2019-04-09 19:40:12 UTC - Devin G. Bost: Gotcha. What about tenants or 
namespaces?
----
2019-04-09 19:40:26 UTC - Grant Wu: Those aren’t materialized as k8s resources 
either :confused:
----
2019-04-09 19:40:38 UTC - Devin G. Bost: Okay. It sounds like there's no 
overlap then.
----
2019-04-09 19:40:38 UTC - Grant Wu: Probably want to talk to @Matteo Merli to 
see what sort of roadmap/plans they have
----
2019-04-09 19:40:45 UTC - Devin G. Bost: Good point.
----
2019-04-09 19:41:00 UTC - Grant Wu: So when you said “components” you weren’t 
referring to the broker/bookkeeper/etc.?
----
2019-04-09 19:41:13 UTC - Devin G. Bost: Correct.
----
2019-04-09 20:10:06 UTC - Ryan Samo: Has anyone attempted to calculate the time 
difference between when messages are produced and when they are consumed? Trying 
to figure out if there are any built-in metrics for this or if I need to roll my 
own. All I can think of is starting a reader and measuring the difference  from 
the message produced time stamp to the time of read. Any suggestions or other 
ways this might be calculated internally? Even if it’s the produce to consume 
ratio every minute that would be enough for my use case.
----
2019-04-09 20:11:19 UTC - Kenan Dalley: I'm basing this on the Cassandra sink 
tutorial from the website which says that a function should be created.
----
2019-04-09 20:13:03 UTC - Ali Ahmed: are you using the kafka source ?
----
2019-04-09 20:15:12 UTC - Kenan Dalley: Yes, I'm using the 
pulsar-io-kafka.2.3.0.nar that I pulled from the website.  It's the only 
connector that I have in the "connectors" folder currently.
----
2019-04-09 20:17:32 UTC - Ali Ahmed: I don’t think there is an example, but it 
is used in the integration tests; you can take a look at those
----
2019-04-09 20:18:08 UTC - Kenan Dalley: I assume that's in the source?
----
2019-04-09 20:19:56 UTC - Ali Ahmed: yes
----
2019-04-09 20:20:28 UTC - Kenan Dalley: Ok, I'll take a look there.  Thanks.
----
2019-04-09 20:20:30 UTC - Ali Ahmed: it’s using docker to model e2e scenarios
----
2019-04-09 20:32:50 UTC - Ryan Samo: Settled on using a reader to compare times, 
but please feel free to chime in if anyone has better ideas! :+1:
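
A minimal sketch of the bookkeeping for that reader-based approach, in plain Java. `LatencyTracker` and its methods are hypothetical helpers, not part of Pulsar. The sketch assumes the publish timestamp is taken from each message (for example `Message.getPublishTime()` in the Java client) and that the read time is the local wall clock when the reader returns the message.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Pulsar: accumulates publish-to-read latencies.
final class LatencyTracker {
    private final List<Long> latenciesMillis = new ArrayList<>();

    // publishTimeMillis: the message's publish timestamp (epoch millis).
    // readTimeMillis: the local clock when the message was read (epoch millis).
    void record(long publishTimeMillis, long readTimeMillis) {
        // Clock skew between producer and reader hosts can give a negative
        // difference; clamp it to zero rather than report a negative latency.
        latenciesMillis.add(Math.max(0L, readTimeMillis - publishTimeMillis));
    }

    // Mean latency over all recorded messages, or 0.0 if none were recorded.
    double averageMillis() {
        return latenciesMillis.stream().mapToLong(Long::longValue).average().orElse(0.0);
    }

    // Largest latency seen so far, or 0 if none were recorded.
    long maxMillis() {
        return latenciesMillis.stream().mapToLong(Long::longValue).max().orElse(0L);
    }
}
```

With a reader loop, each received message would lead to one `record(...)` call, and the averages could be logged once a minute. The numbers are only as good as the clock synchronization between the producing and reading hosts.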
----
2019-04-10 00:14:47 UTC - Matteo Merli: Functions and IO connectors (which are 
based on functions) are not instantiated through the Helm chart. Rather the 
worker service (or broker) will just create a K8S deployment when the function 
is created
----
2019-04-10 00:15:48 UTC - Grant Wu: Oh, I didn't realize it created a k8s 
deployment
----
2019-04-10 00:17:04 UTC - Matteo Merli: Yes, it depends on the deployment mode 
(thread, process or K8S)
----
2019-04-10 02:39:26 UTC - Steve Kim: Does anyone have guidance on configuring 
pulsar to improve read performance when reading from segments that have been 
offloaded to object storage (e.g. S3)? Of course reading from object storage 
will be slower than reading from bookies. However, I am surprised by how slow 
it is, when I compare to reading data directly from cloud storage without going 
through pulsar.
----
2019-04-10 02:40:56 UTC - Steve Kim: I see that there is a configuration 
parameter  `s3ManagedLedgerOffloadReadBufferSizeInBytes`. Are there other 
relevant configuration parameters? Should I be adjusting the size of the 
offloaded segments?
----
2019-04-10 02:57:42 UTC - Sanjeev Kulkarni: @Ivan Kelly @jia zhai might be able 
to help you @Steve Kim
----
2019-04-10 03:14:01 UTC - jia zhai: @Steve Kim, you are right, 
s3ManagedLedgerOffloadReadBufferSizeInBytes is the parameter.
----
2019-04-10 06:17:58 UTC - Olivier Chicha: Hello,
Is there a way to have a sticky distribution with pulsar?
The idea is to have a shared subscription, but instead of having a round robin 
distribution, I would like that each message is distributed based on a hash 
function (on one of the properties of the message)
I think that it is feasible with Kafka through partitions
but I don't see how this can be achieved via Pulsar. We initially thought 
that we could use the Pulsar partitioned topic as well, but after re reading 
the doc we realized that it is not the case (from our understanding)
----
2019-04-10 06:23:25 UTC - Ali Ahmed: @Olivier Chicha you want a specific 
message to go into a specific partition ?
----
2019-04-10 06:30:59 UTC - Matteo Merli: @Olivier Chicha A failover subscription 
on a partitioned topic will achieve the same
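
A rough sketch of why this gives sticky routing, assuming messages are published with a key and the key's hash selects the partition. `StickyRouting` and `partitionFor` are hypothetical names, and the hash shown is only illustrative; it is not claimed to be the one the Pulsar client uses.

```java
// Illustrative only: the hash below is an assumption for this example,
// not necessarily the hashing scheme used by the Pulsar client.
final class StickyRouting {
    // Maps a message key to a partition index in [0, numPartitions).
    static int partitionFor(String key, int numPartitions) {
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be positive");
        }
        // floorMod keeps the result non-negative even for negative hash codes.
        return Math.floorMod(key.hashCode(), numPartitions);
    }
}
```

Because the same key always maps to the same partition, and a failover subscription has a single active consumer per partition, all messages with a given key end up at the same consumer until the active consumer for that partition changes.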
----
2019-04-10 06:56:14 UTC - Olivier Chicha: @Matteo Merli Thanks a lot for your 
answer.
So if I create a failover subscription on a partitioned topic, it will not be 
the same consumer that will be the master on each partition?
I thought that it would be the same for each partition based on what is written 
in the documentation of "failover subscription"
"In failover mode, multiple consumers can attach to the same subscription. The 
consumers will be lexically sorted by the consumer's name and the first 
consumer will initially be the only one receiving messages. This consumer is 
called the master consumer"
How are masters distributed over the partitions?
Is there a documentation about it?
----
2019-04-10 07:02:41 UTC - Matteo Merli: Uhm.. apparently the javadocs 
publishing on website got stuck some time back.

Take a look at:
<https://github.com/apache/pulsar/blob/c79fd728cf27417ca117ca220dd07dc4319d4c46/pulsar-client-api/src/main/java/org/apache/pulsar/client/api/SubscriptionType.java#L46>
----
2019-04-10 07:04:11 UTC - Matteo Merli: > How are masters distributed over 
the partitions?

Brokers will pick active consumers such that partitions will be evenly 
distributed among available consumers
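
A toy model of what an even distribution could look like. The round-robin policy over name-sorted consumers is an assumption made purely for illustration, since the exact selection logic is not spelled out here; `ActiveConsumerModel` and `assign` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model: round-robin is an assumed policy used only for illustration,
// not a description of the broker's actual implementation.
final class ActiveConsumerModel {
    // Returns partition index -> name of the consumer assumed active for it.
    static Map<Integer, String> assign(int numPartitions, List<String> consumers) {
        if (consumers.isEmpty()) {
            throw new IllegalArgumentException("at least one consumer is required");
        }
        List<String> sorted = new ArrayList<>(consumers);
        Collections.sort(sorted); // deterministic order by consumer name
        Map<Integer, String> active = new TreeMap<>();
        for (int p = 0; p < numPartitions; p++) {
            active.put(p, sorted.get(p % sorted.size()));
        }
        return active;
    }
}
```

For example, `assign(4, List.of("c2", "c1"))` makes `c1` active on partitions 0 and 2 and `c2` active on partitions 1 and 3, so each consumer is active on half of the partitions.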
----
2019-04-10 07:15:16 UTC - Olivier Chicha: Great this was really a critical 
point for us.
FYI the doc I was referring to is: 
<https://pulsar.apache.org/docs/en/concepts-messaging/#failover>
Is there a way for a consumer to know :
- on which partition he is the master ?
- that he has become a master or a slave for a partition, or that simply the 
"distribution" on a topic has changed?
----
2019-04-10 07:16:51 UTC - Matteo Merli: yes, take a look at 
<https://pulsar.apache.org/api/client/org/apache/pulsar/client/api/ConsumerBuilder.html#consumerEventListener-org.apache.pulsar.client.api.ConsumerEventListener->
----
2019-04-10 07:19:31 UTC - Matteo Merli: > FYI the doc I was referring to is: 
<https://pulsar.apache.org/docs/en/concepts-messaging/#failover>

Yes, we really need to clarify this in the docs
----
2019-04-10 07:32:12 UTC - Olivier Chicha: Ok, so the ConsumerEventListener will 
allow me to be notified of the changes : This is great and that should be 
enough for us
As far as I understand, there is no way to get directly the list of the 
partitions on which my consumer is master for a given topic, is that correct?
----
2019-04-10 07:37:54 UTC - Olivier Chicha: the message from merlimat in the 
thread answered my question thanks
----
2019-04-10 07:40:46 UTC - Matteo Merli: No, but when you create the consumer 
you’ll get all the notifications 
----
2019-04-10 07:40:58 UTC - Olivier Chicha: On a totally different subject:
Do you know if there is any plan / project to provide an Elixir / Erlang 
implementation of the Pulsar client?
----
2019-04-10 07:41:34 UTC - Matteo Merli: Also you can check the topic stats to 
see which one is active/inactive 
----
2019-04-10 07:42:41 UTC - Olivier Chicha: Great, thank you very much for all 
your answers.
----
2019-04-10 08:07:22 UTC - Thor Sigurjonsson: I noticed this problem a few days 
ago and mentioned it to @Matteo Merli. I started working on a fix but got side 
tracked with my regular day job :slightly_smiling_face:
I should have a commit later this week.
----
2019-04-10 08:08:51 UTC - Sijie Guo: cool :+1:
----
2019-04-10 08:32:39 UTC - Kev Jackson: morning - so I'm moving towards building 
a pulsar cluster after testing with a single instance
----
2019-04-10 08:33:26 UTC - Kev Jackson: and I'm starting with the docs that 
suggest building the Zookeeper cluster first - reading the zookeeper docs 
suggests something interesting
----
2019-04-10 08:34:23 UTC - Kev Jackson: "ZooKeeper runs in Java, release 1.8 or 
greater (JDK 8 or greater, FreeBSD support requires openjdk8). It runs as an 
ensemble of ZooKeeper servers. Three ZooKeeper servers is the minimum 
recommended size for an ensemble, and we also recommend that they run on 
separate machines. At Yahoo!, ZooKeeper is usually deployed on dedicated RHEL 
boxes, with dual-core processors, 2GB of RAM, and 80GB IDE hard drives"  - is 
this still suggested sizing for zookeeper for a pulsar cluster
----
2019-04-10 08:45:03 UTC - Ivan Kelly: @Kev Jackson it depends on the load on 
the cluster, but that should be enough for most cases
----
2019-04-10 08:47:18 UTC - Ali Ahmed: @Olivier Chicha there is current plans a 
wrapper over c++ would be the way to go, it just depends on the community demand
----
