2019-08-26 09:15:33 UTC - Ali Ahmed: @Emanuel what’s your use case for that ?
----
2019-08-26 09:36:23 UTC - Poule: the POST to create a new function gives me a
415 Unsupported Media type, what am I missing
----
2019-08-26 09:55:36 UTC - tuteng: Please set the HTTP header
`Content-Type: multipart/form-data` or `Content-Type: application/json`
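For instance, a rough sketch (not an official client) of building the multipart
POST that the functions endpoint expects; the endpoint path and form-field names
here are assumptions based on Pulsar's admin API, so check them against your
broker version's docs:

```python
import json
import urllib.request
import uuid

def build_create_function_request(admin_url, tenant, namespace, name,
                                  package_bytes, function_config):
    """Build (but do not send) the multipart POST for function creation."""
    boundary = uuid.uuid4().hex
    # JSON config part of the multipart body
    config_part = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="functionConfig"\r\n'
        "Content-Type: application/json\r\n\r\n"
        + json.dumps(function_config) + "\r\n"
    )
    # Package (e.g. the .py file) part
    file_part_header = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="data"; filename="function.py"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    )
    body = (
        (config_part + file_part_header).encode()
        + package_bytes
        + f"\r\n--{boundary}--\r\n".encode()
    )
    url = f"{admin_url}/admin/v3/functions/{tenant}/{namespace}/{name}"
    req = urllib.request.Request(url, data=body, method="POST")
    # Setting this header is what avoids the 415: the server must see
    # multipart/form-data (with its boundary), not a bare body
    req.add_header("Content-Type", f"multipart/form-data; boundary={boundary}")
    return req
```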
----
2019-08-26 10:52:34 UTC - Emanuel: @Ali Ahmed I need additional historical data
in my stream processing to choose the sink topic. For that I need to query
historical events in different topics
----
2019-08-26 10:53:11 UTC - Emanuel: I can find a lot of examples about Pulsar
SQL but none of them are of a practical nature
----
2019-08-26 16:34:17 UTC - Addison Higham: hrm... in regards to
<https://github.com/apache/pulsar/issues/4448>, does that mean it should mostly
just be bumping the dependencies? And if not, can I use the non-bundled
zookeeper and still have it work with 3.5? For my global ZK, I really want to
use TLS for server-server communication. Trying to figure out easiest way there
----
2019-08-26 17:09:59 UTC - Raman Gupta: Thanks, I didn't wait that long. Good to
know.
----
2019-08-26 17:10:48 UTC - Raman Gupta: I also wrongly assumed that when I
explicitly refreshed, the dashboard would get the latest stats.
----
2019-08-26 17:49:11 UTC - Raman Gupta: Regarding the backlog metric, is there a
way to differentiate between old messages that have not been acked vs new
messages that have not been acked? The former may indicate a poison-pill issue
or similar problem that needs to be looked into, whereas the latter is probably
just ongoing normal processing.
----
2019-08-26 17:49:26 UTC - Raman Gupta: Couple more questions as I am updating
the kafka migration doc
(<https://docs.google.com/document/d/11lw2cFABwZvqHi-l20Zm2fe1BsQ2F6D5MzxFwbBuN5Y/edit>)
I have been working on:
1) if I want to do things like combining streams together via a streams
framework, I would have to use something like Flink, Heron, or Spark, right?
2) Is there any plan to create / contribute a Beam I/O transform for Pulsar
(<https://beam.apache.org/documentation/io/built-in/>)?
----
2019-08-26 17:51:47 UTC - Matteo Merli: It should be. As mentioned, we found a
couple of compat issues in ZK 3.5 that would prevent the ability to rollback to
3.4, though we fixed them through AOJ hacks
----
2019-08-26 17:52:58 UTC - Sijie Guo: if you use `topics stats-internal` to
retrieve the internal stats per topic, you can see the last mark-delete
position. That can probably help you understand whether your cursor is moving
or not, which can indicate if there is a poison-pill issue.
Also you can use a Dead-Letter-Topic to handle poison-pill issues.
+1 : Raman Gupta
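As an illustration of reading that output, here is a sketch that flags
subscriptions whose mark-delete position trails the topic's
`lastConfirmedEntry`; the JSON field names follow the `stats-internal` output
as of Pulsar 2.4, but treat the exact schema as an assumption for your version:

```python
import json

def lagging_cursors(stats_internal_json):
    """Return (subscription, markDeletePosition) pairs behind lastConfirmedEntry."""
    stats = json.loads(stats_internal_json)
    # Positions are "ledgerId:entryId" strings; compare them as integer tuples
    last_confirmed = tuple(int(p) for p in stats["lastConfirmedEntry"].split(":"))
    lagging = []
    for name, cursor in stats.get("cursors", {}).items():
        mark_delete = tuple(int(p) for p in cursor["markDeletePosition"].split(":"))
        if mark_delete < last_confirmed:
            lagging.append((name, cursor["markDeletePosition"]))
    return lagging

# Made-up sample in the shape stats-internal returns
sample = json.dumps({
    "lastConfirmedEntry": "57:100",
    "cursors": {
        "healthy-sub": {"markDeletePosition": "57:100"},
        "stuck-sub": {"markDeletePosition": "12:3"},
    },
})
```

A cursor that stays at the same lagging position across repeated snapshots is
the signal to investigate, not a single lagging reading.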
----
2019-08-26 17:53:00 UTC - Addison Higham: it appears to be building... mostly I
am debating whether I should ship my own docker image with the embedded zk 3.5
or roll my own... should be really easy to backport onto 2.4 and just get that
one change
----
2019-08-26 17:54:18 UTC - Matteo Merli: The only reason we rolled back 3.5
-> 3.4 was because of the `-beta` tag in the version
----
2019-08-26 17:56:31 UTC - Addison Higham: I noticed a few other code changes
with the downgrade (or maybe I am getting confused with the BK change) but
didn't trace it all down. I can confirm it at least builds, running tests now.
Will submit a CR for it against master and then backport to my own 2.4 branch,
I don't imagine you want that for 2.4 release?
----
2019-08-26 17:57:11 UTC - Matteo Merli: Yes, that would go for 2.5, only
bugfixes for 2.4.1
----
2019-08-26 17:57:16 UTC - Sijie Guo: > if I want to do things like
combining streams together via a streams framework, I would have to use
something like Flink. Heron, or Spark right?
If your processing logic is simple enough, Pulsar Functions is probably a good
candidate to explore.
Otherwise, you might have to consider a streaming framework like Flink or Spark.
Also FYI we are working on a feature called KoP (Kafka-on-Pulsar), which aims
at providing Kafka compatibility at the protocol level. It means if you already
have KStream applications, you can seamlessly use Pulsar without rewriting your
KStream applications.
> 2) Is there any plan to create / contribute a Beam I/O transform for
Pulsar (<https://beam.apache.org/documentation/io/built-in/>)
We are working on a Beam connector.
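As a sketch of the simple-routing case above (choosing a sink topic per
message, as in Emanuel's question): in a real deployment this class would
subclass `pulsar.Function`, and `context.publish` mirrors the Functions context
API (treat the exact signature as an assumption); the topic names are made up:

```python
class RoutingFunction:
    """Pick a sink topic per message and publish it via the context."""

    # Hypothetical destination topics
    ERRORS_TOPIC = "persistent://public/default/errors"
    EVENTS_TOPIC = "persistent://public/default/events"

    def process(self, input, context):
        # Route messages mentioning "error" to a dedicated topic,
        # everything else to the default events topic
        sink = self.ERRORS_TOPIC if "error" in input else self.EVENTS_TOPIC
        context.publish(sink, input)
        return None
```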
----
2019-08-26 17:57:44 UTC - Matteo Merli: most of the rest of the changes in that
commit were to ensure the downgrade to zk-3.4 was successful
----
2019-08-26 17:59:11 UTC - Raman Gupta: Wow, running KStream apps as-is would be
super-cool
----
2019-08-26 17:59:30 UTC - Raman Gupta: Would that support KStreams exactly-once
processing?
----
2019-08-26 17:59:37 UTC - Raman Gupta: As well as state stores?
----
2019-08-26 18:01:28 UTC - Sijie Guo: The aim is that a KStream application
remains as-is. All the features such as state stores will be supported.
But the initial version will not support exactly-once, because KStream
exactly-once requires transaction support. We will add that after we complete
the transaction support for Pulsar in 2.5.0
----
2019-08-26 18:02:45 UTC - Raman Gupta: Amazing, thanks. Is there a PIP/issue I
can track for this support?
----
2019-08-26 18:05:38 UTC - Sijie Guo: > Is there a PIP/issue I can track for
this support?
we presented a demo at the Pulsar meetup the week before. We are working on
cleaning up the implementation and will send out a PIP this week or so.
----
2019-08-26 18:06:34 UTC - Sijie Guo: Kafka clients and Kafka Connect work well.
KStream applications partially work (except the transaction part).
+1 : Raman Gupta
----
2019-08-26 18:06:59 UTC - Rajiv Abraham: @Rajiv Abraham has joined the channel
----
2019-08-26 18:16:22 UTC - Rajiv Abraham: Hi, I just discovered Pulsar and find
it very cool. I had a few small questions
1) About Debezium integration: do you support all the configuration settings
for Debezium? E.g., I'm interested in
PostgreSQL (<https://debezium.io/docs/connectors/postgresql/#connector-properties>).
There are some settings like `column.blacklist` and `schema.blacklist` that are
not seen at <http://pulsar.apache.org/docs/en/2.3.0/io-cdc-debezium/>. Does
that mean they are not supported, or just not documented?
2) What version of Python is supported in Pulsar Functions, which are very cool?
3) I just wanted to confirm that there is no Python API lib for admin
functions (only the REST API)
4) Is there a way to selectively choose columns on tables for CDC in Pulsar?
----
2019-08-26 18:19:13 UTC - Sijie Guo: > 1.
Currently we support PostgreSQL and MySQL. All the settings are supported
(they are just not documented).
Technically we support all DBs that Debezium supports, but you just need to
pull the corresponding dependencies and build the connector.
> What version of python is supported on Pulsar Function which is very cool.
both python2 and python3
> I just wanted to confirm that there is no python api lib for admin
functions(only REST api)
Currently there is no python api lib yet.
> Is there a way to selectively choose columns on tables for CDC in pulsar?
Currently no, but it should be trivial to add this feature by adding a
projection function to the connector.
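A sketch of that projection idea, assuming a Debezium-style envelope where the
new row image lives under `after` (the column names here are made up):

```python
def project_columns(change_event, allowed_columns):
    """Keep only allowed_columns in the 'after' image of a Debezium-style event."""
    projected = dict(change_event)
    if projected.get("after"):
        projected["after"] = {k: v for k, v in projected["after"].items()
                              if k in allowed_columns}
    return projected

# Hypothetical change event in Debezium's envelope shape
event = {"op": "u", "after": {"id": 7, "email": "a@b.c", "ssn": "000-00-0000"}}
```

Running the projection over each event before it is written to the topic gives
the effect of `column.blacklist`/whitelist-style filtering.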
----
2019-08-26 18:22:40 UTC - Rajiv Abraham: @Sijie Guo Thanks!
when you say python3, is it always the latest, e.g. Python 3.7? Do you upgrade
automatically, or can we specify the Python version?
----
2019-08-26 18:23:24 UTC - Sijie Guo: I think it works for 3.7.
You can specify the python version
----
2019-08-26 18:29:03 UTC - Rajiv Abraham: nice! Thank you.
----
2019-08-26 18:42:46 UTC - Poule: the python in docker image is 3.5
----
2019-08-26 18:43:34 UTC - Poule: <https://github.com/apache/pulsar/issues/4944>
----
2019-08-26 18:47:21 UTC - Poule: @Sijie Guo when you say we can specify python
version you mean in the shebang #!python3.7 or elsewhere?
----
2019-08-26 18:49:04 UTC - Sijie Guo: sorry I mean you can specify the python
version in your broker / function worker machines. (if you are running docker,
it means installing the python version in the docker image).
----
2019-08-26 18:49:47 UTC - Sijie Guo: it is an environment setting. the python
version is installed by the administrator who deployed the cluster.
----
2019-08-26 18:56:42 UTC - Rajiv Abraham: ah ok, thanks @Poule and @Sijie Guo
for clarifying it further.
----
2019-08-26 19:22:41 UTC - Ali Ahmed: @Emanuel That won’t be a good fit. Presto
queries can take a while, which doesn’t fit the low-latency computing model of
Pulsar functions.
+1 : Emanuel
----
2019-08-26 19:28:26 UTC - Ali Ahmed: You probably want an external service that
builds a cache for the functions to use. You can use the Pulsar Functions
state for that cache
----
2019-08-26 19:52:28 UTC - Raman Gupta: Can the number of partitions for a
Pulsar topic be changed after creation? If not, how best to "migrate" a topic,
and its subscriptions, from a non-partitioned or N-partitioned topic to an
M-partitioned topic?
----
2019-08-26 20:03:37 UTC - Chris Bartholomew: Yes, the number of partitions can
be updated. There is more info here:
<https://pulsar.apache.org/docs/en/admin-api-partitioned-topics/#update>
+1 : Raman Gupta, Ali Ahmed
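For reference, a rough offline sketch of the admin call behind that update,
building (but not sending) the POST; the path follows the v2 admin API linked
above, so verify it for your version:

```python
import urllib.request

def build_update_partitions_request(admin_url, tenant, namespace, topic,
                                    partitions):
    """Build (but do not send) the POST that grows a partitioned topic."""
    url = f"{admin_url}/admin/v2/persistent/{tenant}/{namespace}/{topic}/partitions"
    # The request body is just the new partition count
    req = urllib.request.Request(url, data=str(partitions).encode(),
                                 method="POST")
    req.add_header("Content-Type", "application/json")
    return req
```

Note that the partition count can only be increased, never decreased.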
----
2019-08-26 20:44:48 UTC - Igor Zubchenok: @Matteo Merli @Sijie Guo could you
help please?
----
2019-08-26 21:58:27 UTC - Matteo Merli: Taking a look
+1 : Igor Zubchenok
----
2019-08-26 22:53:39 UTC - Igor Zubchenok: any ideas?
----
2019-08-27 00:47:39 UTC - Jacob: @Jacob has joined the channel
----
2019-08-27 05:58:52 UTC - Kim Christian Gaarder: In Pulsar 2.4 it’s possible to
seek on message publish time. Is there also a way to populate a topic with data
and control the publish time or is this always a timestamp set by the broker at
publish-time? I ask because I want to be able to perform backup/restore-like
functionality on topics without losing the original publish times.
----
2019-08-27 06:38:39 UTC - ivalue2333: @ivalue2333 has joined the channel
----