Acknowledge contributors for each release

2022-11-20 Thread tison
Hi,

Dave's email about first-time contributions reminds me of the idea to
acknowledge contributors for each release.

Let's see how the Flink community does it:

For each release blog, e.g. [1], in the last section "List of Contributors",
explicitly list out contributors and acknowledge their contributions.

When it went back to the first time I got listed, I was proud to share it
on social media and it encouraged me a lot to make more contributions.
Also, from the community perspective, releasing patches as well as
acknowledging the authors are somewhat fundamental rewards we can offer.

To generate the list, I adapt the scripts provided by Flink as:

1. For Pulsar v2.10.0 (minor version release):

  git log --pretty="%an%n%cn" v2.9.0..v2.10.0 | sort | uniq | tr "\n" "," |
sed 's/,/, /g'

... which gives:

Addison Higham, Ali Ahmed, Aloys, Amar Prakash Pandey, Andras Beni, Andrey
Yegorov, AnonHxy, Anonymitaet, Arnar, Baozi, Bharani Chadalavada, Bowen Li,
Boyang Jerry Peng, Callum Duffy, Christophe Bornet, Da Xiang Huang, Dave
Fisher, David Kjerrumgaard, Devin Bost, Dezhi LIiu, Dianjin Wang, Diego,
Enrico Olivelli, Eric Shen, Eron Wright, Fangbin Sun, Frank J Kelly,
Frederic Kneier, Gautier DI FOLCO, GitHub, Haaroon Y, Hang Chen,
HuangQiang, Huanli Meng, Jagadesh Adireddi, Jason918, JiangHaiting, Jin,
Jiwei Guo, Kai, Kai Wang, Koen Rutten, Lakshmi Balu, Lari Hotari, Lars
Hvam, Lei Zhiyuan, Li Li, Lishen Yao, Marvin Cai, Masahiro Sakamoto,
Massimiliano Mirelli, Matt Fleming, Matteo Merli, Md Mostafijur Rahman,
Michael Marshall, Neng Lu, Nicklee007, Nicolò Boschi, Ofek Lev, Paul Gier,
Peter Tinti, Qiang Huang, Qiang Zhao, Rajan Dhabalia, Roc Marshal, Ruguo
Yu, Rui Fu, Saumitra Srivastav, Shen Liu, Sijie Guo, Smile, TakaHiro, Tao
Jiuming, Thomas Leplus, Tong, Travis Sturzl, Vincent Royer, WangJialing,
Xiangying Meng, Xiaobing Fang, Xiaoyu Hou, YANGLiiN, Yan, Yang Yang,
Yannick Koechlin, Yong Zhang, Yunze Xu, Yuri Mizushima, Yuto Furuta, Zach
Walsh, ZhangJian He, Zhanpeng Wu, Zhiwu Wang, Zike Yang, Zixuan Liu, Ziyao
Wei, aarondonwilliams, baomingyu, bentonliang, billowqiu, chenlin,
codertmy, congbo, entvex, fengtao1998, feynmanlin, fu-turer, gaozhangmin,
goflutterjava, hanmz, hrsakai, imryao, junqingzh, kaushik-develop,
kijanowski, kimula takesi, lightzhao, lin chen, lipenghui, litao,
liuchangqing, liudezhi, madhavan-narayanan, ming, mingyifei, momo-jun,
penghui, ran, sijia-w, suiyuzeng, wenbingshen, xiaolong ran, youzipi,
zhaoyajun2009, 包子, 萧易客

2. For Pulsar v2.10.1 (patch version release):

  git log --pretty="%an%n%cn" v2.10.0..v2.10.1 | sort | uniq | tr "\n" ","
| sed 's/,/, /g'

... which gives:

Adrian Paul, AlvaroStream, Andrey Yegorov, Baodi Shi, Baozi, Christophe
Bornet, Cong Zhao, Dezhi LIiu, Enrico Olivelli, GitHub, JiangHaiting, Jim
Baugh, Jiwei Guo, Kai Wang, Kay Johansen, Lari Hotari, LinChen, Lishen Yao,
Matt-Esch, Matteo Merli, Michael Marshall, Neng Lu, Nicolò Boschi, Qiang
Huang, Qiang Zhao, Ruguo Yu, Rui Fu, Shen Liu, Tao Jiuming, Tian Luo,
WangJialing, Xiangying Meng, Xiaoyu Hou, Yan Zhao, Yang Yang, Yong Zhang,
Yunze Xu, Yuri Mizushima, ZhangJian He, Zike Yang, Zixuan Liu, boatrainlsz,
codertmy, congbo, dependabot[bot], fengyubiao, gaoran10, gaozhangmin,
grayson, lin chen, lipenghui, lixinyang, llIlll, penghui, ran, wuxuanqicn,
赵延,

I think we can integrate such a step into the release process[2] if we
reach a consensus. But let me start this thread first to see your thoughts
and suggestions.

Looking forward to your feedback!

Best,
tison.

[1] https://flink.apache.org/news/2022/10/28/1.16-announcement.html
[2] https://pulsar.apache.org/contribute/release-process


Re: Identifying contributions from new contributors

2022-11-20 Thread tison
Hi Dave,

Interesting. I found @labuladong's related proposal[1] days before.

There can be two ways to identify such contributions:

1. Running CI tasks to label them when a new issue/PR is opened. Existing
actions[2] provide the functionality to comment but we can easily extend it
to label.
2. Periodically collect opened first-time contributions with a script, and
perhaps send it to the dev@ mailing list.

Hope this helps :)

Best,
tison.

[1] https://github.com/apache/pulsar-test-infra/pull/83
[2] https://github.com/actions/first-interaction


Dave Fisher  于2022年11月21日周一 14:04写道:

> Hi -
>
> Is it possible to identify PRs and issues from people who are new to
> Pulsar? If so then it might be good to maintain labels.
>
> Knowing and helping new contributors to pulsar benefits us all.
>
> Regards,
> Dave
>
> Sent from my iPhone
>


Identifying contributions from new contributors

2022-11-20 Thread Dave Fisher
Hi -

Is it possible to identify PRs and issues from people who are new to Pulsar? If 
so then it might be good to maintain labels.

Knowing and helping new contributors to pulsar benefits us all.

Regards,
Dave

Sent from my iPhone


Re: [ANNOUNCE] New Committer: Cong Zhao

2022-11-20 Thread Zixuan Liu
Congrats! Cong

Best Regards,
Zixuan

houxiaoyu  于2022年11月21日周一 12:24写道:

> Congrats! Cong
>
> Best,
> Xiaoyu Hou
>
> Haiting Jiang  于2022年11月21日周一 12:10写道:
>
> > The Project Management Committee (PMC) for Apache Pulsar has invited
> > Cong Zhao (https://github.com/coderzc)
> > to become a committer and we are pleased to announce that he has
> accepted.
> >
> > Being a committer enables easier contribution to the
> > project since there is no need to go via the patch
> > submission process. This should enable better productivity.
> >
> > Welcome and congratulations, Cong Zhao!
> >
> > Please join us in congratulating and welcoming Cong Zhao onboard!
> >
> > Best Regards,
> > Haiting on behalf of the Pulsar PMC
> >
>


Re: [ANNOUNCE] New Committer: Cong Zhao

2022-11-20 Thread houxiaoyu
Congrats! Cong

Best,
Xiaoyu Hou

Haiting Jiang  于2022年11月21日周一 12:10写道:

> The Project Management Committee (PMC) for Apache Pulsar has invited
> Cong Zhao (https://github.com/coderzc)
> to become a committer and we are pleased to announce that he has accepted.
>
> Being a committer enables easier contribution to the
> project since there is no need to go via the patch
> submission process. This should enable better productivity.
>
> Welcome and congratulations, Cong Zhao!
>
> Please join us in congratulating and welcoming Cong Zhao onboard!
>
> Best Regards,
> Haiting on behalf of the Pulsar PMC
>


[ANNOUNCE] New Committer: Cong Zhao

2022-11-20 Thread Haiting Jiang
The Project Management Committee (PMC) for Apache Pulsar has invited
Cong Zhao (https://github.com/coderzc)
to become a committer and we are pleased to announce that he has accepted.

Being a committer enables easier contribution to the
project since there is no need to go via the patch
submission process. This should enable better productivity.

Welcome and congratulations, Cong Zhao!

Please join us in congratulating and welcoming Cong Zhao onboard!

Best Regards,
Haiting on behalf of the Pulsar PMC


[GitHub] [pulsar] Raunak-Agrawal added a comment to the discussion: Not able to run pulsar function locally

2022-11-20 Thread GitBox


GitHub user Raunak-Agrawal added a comment to the discussion: Not able to run 
pulsar function locally

@Jason918 can you please help with this?

GitHub link: 
https://github.com/apache/pulsar/discussions/18552#discussioncomment-4191516


This is an automatically sent email for dev@pulsar.apache.org.
To unsubscribe, please send an email to: dev-unsubscr...@pulsar.apache.org



Re: Data quality problem

2022-11-20 Thread PengHui Li
Hi, Devin

Thanks for raising the great discussion. It looks like the salient point is
that Pulsar
doesn't support native JSON schema. Instead, the schema is defined in the
Avro
standard but serialized to JSON format.JSON Schema combines aspects of
type-based and rule-based. As this article[1] said "JSON Schema combines
aspects of both a grammar-based language and a rule-based one". But the
Avro
schema definition only has the aspect of grammar-based.

[1] https://yokota.blog/2021/03/29/understanding-json-schema-compatibility/

> One of the issues with Pulsar's current implementation of schemas for JSON
is the requirement to always have a POCO or some kind of type builder to
construct the schema. This requirement can be cumbersome for users who only
care about a few fields on the object.

Sorry. I do not fully understand here. Is it also related to the "data
quality" problem
that we discussed? For the consumer side, we can use the AUTO_CONSUME schema
to receive GenericObject (For JSON schema, you can deal with JsonObject
directly).
For the producer side, I think yes. We can either send an Object or bytes[]
(AUTO_PRODUCE).

> Plus, the use case is a little different compared to a DLQ or Retry
topic because we'd like a way to handle content failures separately from
other kinds of failures.

Yes, I agree. It's not a field of DLQ.

Thanks,
Penghui

On Thu, Nov 17, 2022 at 7:37 AM Devin Bost  wrote:

> I appreciate all the thoughts and questions so far.
>
> One of the issues with Pulsar's current implementation of schemas for JSON
> is the requirement to always have a POCO or some kind of type builder to
> construct the schema. This requirement can be cumbersome for users who only
> care about a few fields on the object.
> Protobuf attempts to simplify the implementation of the mapping (from data
> to class) by having a language-independent mechanism for defining the data
> (so the POCO can be generated in the desired language), but obviously, that
> offers very few benefits for JSON. Additionally, protobuf and Avro don't
> provide a way to express constraints on data *values*. Consider an example.
> Let's say a site is sending messages like this:
> {
> "user": "bob",
> "action": "click",
> "trackingUrn": "urn:siteA:homepage:topNavigation:0.124",
> "payload" : {
>. . .
>[ *highly nested or dynamic data* ]
>. . .
>   }
> }
>
> Here are some issues we might run into:
> 1. A consumer wants to take action on messages based on a single field.
> They only care about if the field exists and has an allowed value. They
> don't want to spend a week trying to map each of the nested fields into a
> POCO and then worry about maintaining the POCO when nested sub-fields are
> updated by upstream teams with breaking changes. Consider these use cases:
>- Validate that the "action" value is oneOf: [ "click", "impression",
> "hover"]. Route content based on the action unless it's an unexpected
> value.
>- Subfields change depending on the trackingUrn values.
> Consider the following:
>A) In the validation use case, the app developer shouldn't need to deal
> with any fields other than "action", but they should be able to express or
> verify that "action" is part of a data contract they have agreed to consume
> from.
>B) Every app like this would need to add its own runtime validation
> logic, and when many different apps are using their own versions of
> validation, the implementations are brittle and become hard to maintain.
> The solution to the brittleness is to adopt a standard that solves the
> interoperability problem.
>C) If subfields are dynamic, well, there's not a good way to express
> that in Avro. Maybe the developer could use maps, but I think that defeats
> the purpose.
> 2. We should be able to compose schemas from shared "schema components" for
> improved reusability. (Consider it like object-oriented schema design.)
> JSON Schema makes this possible (see detailed write-up here
> <
> https://json-schema.org/blog/posts/bundling-json-schema-compound-documents
> >)
> but Avro does not, so Avro schemas end up with duplication everywhere, and
> this duplication is burdensome for developers to maintain. Consequently,
> some developers avoid using schemas entirely, but that has its own
> consequences.
> 3. If a message's content is invalid, send the message to an "invalid
> message topic".  Since the concerns above are mostly around data content at
> runtime, Avro doesn't help us here, but for JSON content, JSON Schema's
> validation spec
> <
> https://json-schema.org/draft/2020-12/json-schema-validation.html#name-overview
> >
> could. Plus, the use case is a little different compared to a DLQ or Retry
> topic because we'd like a way to handle content failures separately from
> other kinds of failures.
>
> (I'm sure I can think of more examples if I give it more thought.)
>
> Devin G. Bost
>
>
> On Wed, Nov 16, 2022 at 6:36 AM 丛搏  wrote:
>
> > hi, Devin:
> > the first Kafka doesn't support 

Re: [DISCUSS] Remove "triage" series labels?

2022-11-20 Thread Dave Fisher
Thanks for taking care of labels!

Sent from my iPhone

> On Nov 20, 2022, at 5:25 PM, tison  wrote:
> 
> FYI - these labels are removed now.
> 
> FWIW, I'm doing this with GitHub CLI:
> 
> for i in {1..53}; do
>  gh label delete "triage/week-$i" --confirm
> done
> 
> Best,
> tison.
> 
> 
> houxiaoyu  于2022年11月21日周一 09:22写道:
> 
>> +1
>> 
>> Best,
>> Xiaoyu Hou
>> 
>> tison  于2022年11月17日周四 23:32写道:
>> 
>>> Hi,
>>> 
>>> When squashing issue backlog, I noticed there're a number of "legacy"
>> tags
>>> in series "traige/week-x".
>>> 
>>> It seems the initiative has been halted and those labels don't provide
>>> continuous value. Thus I suggest we remove "triage" series labels to
>> reduce
>>> engineering debt.
>>> 
>>> What do you think?
>>> 
>>> Best,
>>> tison.
>>> 
>> 



Re: [DISCUSS] Remove "triage" series labels?

2022-11-20 Thread tison
FYI - these labels are removed now.

FWIW, I'm doing this with GitHub CLI:

for i in {1..53}; do
  gh label delete "triage/week-$i" --confirm
done

Best,
tison.


houxiaoyu  于2022年11月21日周一 09:22写道:

> +1
>
> Best,
> Xiaoyu Hou
>
> tison  于2022年11月17日周四 23:32写道:
>
> > Hi,
> >
> > When squashing issue backlog, I noticed there're a number of "legacy"
> tags
> > in series "traige/week-x".
> >
> > It seems the initiative has been halted and those labels don't provide
> > continuous value. Thus I suggest we remove "triage" series labels to
> reduce
> > engineering debt.
> >
> > What do you think?
> >
> > Best,
> > tison.
> >
>


Re: [DISCUSS] Remove "triage" series labels?

2022-11-20 Thread houxiaoyu
+1

Best,
Xiaoyu Hou

tison  于2022年11月17日周四 23:32写道:

> Hi,
>
> When squashing issue backlog, I noticed there're a number of "legacy" tags
> in series "traige/week-x".
>
> It seems the initiative has been halted and those labels don't provide
> continuous value. Thus I suggest we remove "triage" series labels to reduce
> engineering debt.
>
> What do you think?
>
> Best,
> tison.
>


Re: [DISCUSS] Remove "triage" series labels?

2022-11-20 Thread mattison chao
+1

Thanks,
Mattison

On Sat, 19 Nov 2022 at 15:07, Michael Marshall  wrote:

> I agree with removing these labels.
>
> Thanks,
> Michael
>
> On Thu, Nov 17, 2022 at 9:29 PM Yunze Xu 
> wrote:
> >
> > +1
> >
> > Thanks,
> > Yunze
> >
> > On Fri, Nov 18, 2022 at 3:12 AM Enrico Olivelli 
> wrote:
> > >
> > > +1
> > >
> > > Enrico
> > >
> > > Il Gio 17 Nov 2022, 18:40 Matteo Merli  ha
> scritto:
> > >
> > > > +1 Good suggestion
> > > >
> > > >
> > > > --
> > > > Matteo Merli
> > > > 
> > > >
> > > > On Thu, Nov 17, 2022 at 7:32 AM tison  wrote:
> > > > >
> > > > > Hi,
> > > > >
> > > > > When squashing issue backlog, I noticed there're a number of
> "legacy"
> > > > tags
> > > > > in series "traige/week-x".
> > > > >
> > > > > It seems the initiative has been halted and those labels don't
> provide
> > > > > continuous value. Thus I suggest we remove "triage" series labels
> to
> > > > reduce
> > > > > engineering debt.
> > > > >
> > > > > What do you think?
> > > > >
> > > > > Best,
> > > > > tison.
> > > >
>


[GitHub] [pulsar] raunakagrawal47 edited a discussion: Not able to run pulsar function locally

2022-11-20 Thread GitBox


GitHub user raunakagrawal47 edited a discussion: Not able to run pulsar 
function locally

I have tried everything to package and run my python function locally. 
I tried to run the pulsar standalone on my laptop (v. 2.9.1) as well as tried 
to run pulsar in docker (latest image).

Here is the repo: https://github.com/raunakagrawal47/pulsar-test

I am trying to run a basic python function with some dependencies attached to 
it.
Steps I followed to run pulsar locally:
Generate whl file for all required dependencies: `pip3 download --only-binary 
:all: -r requirements.txt -d deps`
Zip the contents of folder: `zip -r format-phone-number.zip . -x test/**\* -x 
venv/**\* -x .idea/**\*`
Run the pulsar function: `bin/pulsar-admin functions localrun --tenant public 
--namespace default --py format-phone-number.zip --classname TestEtl.TestEtl 
--inputs persistent://public/default/in --output 
persistent://public/default/out`

I tried to run pulsar function locally on laptop by running pulsar standalone 
as well as tried to run by copying the zip file to a pulsar docker image using:

```
docker cp format-phone-number.zip bbcba9a3f42b:/pulsar
docker exec -it bbcba9a3f42b /bin/bash
```

When running, I am getting below error:
```
2022-11-20T14:58:51,194+ [main] INFO 
org.apache.pulsar.functions.runtime.process.ProcessRuntimeFactory - Java 
instance jar location is not defined, using the location defined in system 
environment : /pulsar/instances/java-instance.jar
2022-11-20T14:58:51,197+ [main] INFO 
org.apache.pulsar.functions.runtime.process.ProcessRuntimeFactory - Python 
instance file location is not defined using the location defined in system 
environment : /pulsar/instances/python-instance/python_instance_main.py
2022-11-20T14:58:51,197+ [main] INFO 
org.apache.pulsar.functions.runtime.process.ProcessRuntimeFactory - No extra 
dependencies location is defined in either function worker config or system 
environment
2022-11-20T14:58:51,265+ [main] INFO 
org.apache.pulsar.functions.runtime.RuntimeSpawner - public/default/TestEtl-0 
RuntimeSpawner starting function
2022-11-20T14:58:51,266+ [main] INFO 
org.apache.pulsar.functions.runtime.process.ProcessRuntime - Creating function 
log directory /pulsar/logs/functions/public/default/TestEtl
2022-11-20T14:58:51,267+ [main] INFO 
org.apache.pulsar.functions.runtime.process.ProcessRuntime - Created or found 
function log directory /pulsar/logs/functions/public/default/TestEtl
2022-11-20T14:58:51,268+ [main] INFO 
org.apache.pulsar.functions.runtime.process.ProcessRuntime - ProcessBuilder 
starting the process with args python 
/pulsar/instances/python-instance/python_instance_main.py --py 
format-phone-number.zip --logging_directory /pulsar/logs/functions 
--logging_file TestEtl --logging_config_file 
/pulsar/conf/functions-logging/logging_config.ini --instance_id 0 --function_id 
44d3a24d-a564-45ed-a817-9b55d3212351 --function_version 
98cd871d-14ee-40d6-8272-fdc411b834d3 --function_details 
'{"tenant":"public","namespace":"default","name":"TestEtl","className":"TestEtl.TestEtl","runtime":"PYTHON","autoAck":true,"parallelism":1,"source":{"inputSpecs":{"persistent://public/default/in":{}},"cleanupSubscription":true},"sink":{"topic":"persistent://public/default/out","forwardSourceMessageProperty":true},"resources":{"cpu":1.0,"ram":"1073741824","disk":"10737418240"},"componentType":"FUNCTION"}'
 --pulsar_serviceurl pulsar://localhost:6650 --use_tls false --tls_al
 low_insecure false --hostname_verification_enabled false --max_buffered_tuples 
1024 --port 33167 --metrics_port 42809 --expected_healthcheck_interval 30 
--secrets_provider secretsprovider.ClearTextSecretsProvider --cluster_name local
2022-11-20T14:58:51,280+ [main] INFO 
org.apache.pulsar.functions.runtime.process.ProcessRuntime - Started process 
successfully
2022-11-20 14:58:51.526 INFO [140611343988544] Client:88 | Subscribing on Topic 
:persistent://public/default/in
2022-11-20 14:58:51.526 INFO [140611343988544] ClientConnection:189 | [ 
-> pulsar://localhost:6650] Create ClientConnection, timeout=1
2022-11-20 14:58:51.526 INFO [140611343988544] ConnectionPool:96 | Created 
connection for pulsar://localhost:6650
2022-11-20 14:58:51.526 INFO [140611281073920] ExecutorService:41 | Run 
io_service in a single thread
2022-11-20 14:58:51.527 INFO [140611281073920] ClientConnection:375 | 
[127.0.0.1:59164 -> 127.0.0.1:6650] Connected to broker
2022-11-20 14:58:51.535 INFO [140611281073920] HandlerBase:64 | 
[persistent://public/default/in, public/default/TestEtl, 0] Getting connection 
from pool
2022-11-20 14:58:51.535 INFO [140611055253248] ExecutorService:41 | Run 
io_service in a single thread
2022-11-20 14:58:51.540 INFO  [140611281073920] ConsumerImpl:224 | 
[persistent://public/default/in, public/default/TestEtl, 0] Created consumer on 
broker [127.0.0.1:59164 -> 127.0.0.1:6650] 
2022-11-20T14:59:21,662+ [function-timer-thread-1-1] ERROR 

[GitHub] [pulsar] raunakagrawal47 edited a discussion: Not able to run pulsar function locally

2022-11-20 Thread GitBox


GitHub user raunakagrawal47 edited a discussion: Not able to run pulsar 
function locally

I have tried everything to package and run my python function locally. 
I tried to run the pulsar standalone on my laptop (v. 2.9.1) as well as tried 
to run pulsar in docker (latest image).

Here is the repo: https://github.com/raunakagrawal47/pulsar-test

I am trying to run a basic python function with some dependencies attached to 
it.
Steps I followed to run pulsar locally:
Generate whl file for all required dependencies: `pip3 download --only-binary 
:all: -r requirements.txt -d deps`
Zip the contents of folder: `zip -r format-phone-number.zip . -x test/**\* -x 
venv/**\* -x .idea/**\*`
Run the pulsar function: `bin/pulsar-admin functions localrun --tenant public 
--namespace default --py format-phone-number.zip --classname TestEtl.TestEtl 
--inputs persistent://public/default/in --output 
persistent://public/default/out`

I tried to run pulsar function locally on laptop by running pulsar standalone 
as well as tried to run by copying the zip file to a pulsar docker image using:

```
docker cp format-phone-number.zip bbcba9a3f42b:/pulsar
docker exec -it bbcba9a3f42b /bin/bash
```

When running, I am getting below error:
```
2022-11-20T14:58:51,194+ [main] INFO 
org.apache.pulsar.functions.runtime.process.ProcessRuntimeFactory - Java 
instance jar location is not defined, using the location defined in system 
environment : /pulsar/instances/java-instance.jar
2022-11-20T14:58:51,197+ [main] INFO 
org.apache.pulsar.functions.runtime.process.ProcessRuntimeFactory - Python 
instance file location is not defined using the location defined in system 
environment : /pulsar/instances/python-instance/python_instance_main.py
2022-11-20T14:58:51,197+ [main] INFO 
org.apache.pulsar.functions.runtime.process.ProcessRuntimeFactory - No extra 
dependencies location is defined in either function worker config or system 
environment
2022-11-20T14:58:51,265+ [main] INFO 
org.apache.pulsar.functions.runtime.RuntimeSpawner - public/default/TestEtl-0 
RuntimeSpawner starting function
2022-11-20T14:58:51,266+ [main] INFO 
org.apache.pulsar.functions.runtime.process.ProcessRuntime - Creating function 
log directory /pulsar/logs/functions/public/default/TestEtl
2022-11-20T14:58:51,267+ [main] INFO 
org.apache.pulsar.functions.runtime.process.ProcessRuntime - Created or found 
function log directory /pulsar/logs/functions/public/default/TestEtl
2022-11-20T14:58:51,268+ [main] INFO 
org.apache.pulsar.functions.runtime.process.ProcessRuntime - ProcessBuilder 
starting the process with args python 
/pulsar/instances/python-instance/python_instance_main.py --py 
format-phone-number.zip --logging_directory /pulsar/logs/functions 
--logging_file TestEtl --logging_config_file 
/pulsar/conf/functions-logging/logging_config.ini --instance_id 0 --function_id 
44d3a24d-a564-45ed-a817-9b55d3212351 --function_version 
98cd871d-14ee-40d6-8272-fdc411b834d3 --function_details 
'{"tenant":"public","namespace":"default","name":"TestEtl","className":"TestEtl.TestEtl","runtime":"PYTHON","autoAck":true,"parallelism":1,"source":{"inputSpecs":{"persistent://public/default/in":{}},"cleanupSubscription":true},"sink":{"topic":"persistent://public/default/out","forwardSourceMessageProperty":true},"resources":{"cpu":1.0,"ram":"1073741824","disk":"10737418240"},"componentType":"FUNCTION"}'
 --pulsar_serviceurl pulsar://localhost:6650 --use_tls false --tls_al
 low_insecure false --hostname_verification_enabled false --max_buffered_tuples 
1024 --port 33167 --metrics_port 42809 --expected_healthcheck_interval 30 
--secrets_provider secretsprovider.ClearTextSecretsProvider --cluster_name local
2022-11-20T14:58:51,280+ [main] INFO 
org.apache.pulsar.functions.runtime.process.ProcessRuntime - Started process 
successfully
2022-11-20 14:58:51.526 INFO [140611343988544] Client:88 | Subscribing on Topic 
:persistent://public/default/in
2022-11-20 14:58:51.526 INFO [140611343988544] ClientConnection:189 | [ 
-> pulsar://localhost:6650] Create ClientConnection, timeout=1
2022-11-20 14:58:51.526 INFO [140611343988544] ConnectionPool:96 | Created 
connection for pulsar://localhost:6650
2022-11-20 14:58:51.526 INFO [140611281073920] ExecutorService:41 | Run 
io_service in a single thread
2022-11-20 14:58:51.527 INFO [140611281073920] ClientConnection:375 | 
[127.0.0.1:59164 -> 127.0.0.1:6650] Connected to broker
2022-11-20 14:58:51.535 INFO [140611281073920] HandlerBase:64 | 
[persistent://public/default/in, public/default/TestEtl, 0] Getting connection 
from pool
2022-11-20 14:58:51.535 INFO [140611055253248] ExecutorService:41 | Run 
io_service in a single thread
2022-11-20 14:58:51.540 INFO  [140611281073920] ConsumerImpl:224 | 
[persistent://public/default/in, public/default/TestEtl, 0] Created consumer on 
broker [127.0.0.1:59164 -> 127.0.0.1:6650] 
2022-11-20T14:59:21,662+ [function-timer-thread-1-1] ERROR