Re: [VOTE] PIP-339: Introducing the --log-topic Option for Pulsar Sinks and Sources

2024-02-27 Thread Pengcheng Jiang
@Jiwei Guo, would you please help merge the PR?
Many thanks.

Regards,
Pengcheng Jiang

On Wed, Feb 28, 2024 at 09:02, Pengcheng Jiang wrote:

> Closing this vote with 4 +1 (binding) and 2 +1 (non-binding).
>
> On Tue, Feb 27, 2024 at 19:14, mattison chao wrote:
>
>> +1 (binding)
>>
>> Best,
>> Mattison
>> On Feb 27, 2024 at 16:04 +0800, Baodi Shi wrote:
>> > +1(non-binding)
>> >
>> > Thanks,
>> > Baodi Shi
>> >
>> >
>> > On Feb 27, 2024 at 15:57:07, Hang Chen  wrote:
>> >
>> > > +1(binding)
>> > >
>> > > Regards,
>> > > Hang
>> > >
>> > > On Tue, Feb 27, 2024 at 15:54, guo jiwei wrote:
>> > >
>> > > +1 (binding)
>> > >
>> > > Regards
>> > > Jiwei Guo (Tboy)
>> > >
>> > > On Tue, Feb 27, 2024 at 10:18 AM Zike Yang wrote:
>> > >
>> > > > > +1 (non-binding)
>> > > > >
>> > > > > BR,
>> > > > > Zike Yang
>> > > > >
>> > > > > On Tue, Feb 27, 2024 at 8:56 AM PengHui Li wrote:
>> > > > >
>> > > > > > > +1 (binding)
>> > > > > > >
>> > > > > > > Regards,
>> > > > > > > Penghui
>> > > > > > >
>> > > > > > > On Mon, Feb 26, 2024 at 5:44 PM Pengcheng Jiang wrote:
>> > > > > > >
>> > > > > > > > > Hi, community
>> > > > > > > > >
>> > > > > > > > > I'm starting the vote for PIP-339: Introducing the --log-topic Option for
>> > > > > > > > > Pulsar Sinks and Sources
>> > > > > > > > > PIP link: https://github.com/apache/pulsar/pull/22071
>> > > > > > > > >
>> > > > > > > > > Thanks,
>> > > > > > > > > Pengcheng Jiang
>> > > > > > > > >
>> > > > >
>> > >
>>
>


Re: [VOTE] PIP-339: Introducing the --log-topic Option for Pulsar Sinks and Sources

2024-02-27 Thread Pengcheng Jiang
Closing this vote with 4 +1 (binding) and 2 +1 (non-binding).

On Tue, Feb 27, 2024 at 19:14, mattison chao wrote:

> +1 (binding)
>
> Best,
> Mattison
> On Feb 27, 2024 at 16:04 +0800, Baodi Shi wrote:
> > +1(non-binding)
> >
> > Thanks,
> > Baodi Shi
> >
> >
> > On Feb 27, 2024 at 15:57:07, Hang Chen  wrote:
> >
> > > +1(binding)
> > >
> > > Regards,
> > > Hang
> > >
> > > > On Tue, Feb 27, 2024 at 15:54, guo jiwei wrote:
> > > >
> > > > +1 (binding)
> > > >
> > > > Regards
> > > > Jiwei Guo (Tboy)
> > > >
> > > > On Tue, Feb 27, 2024 at 10:18 AM Zike Yang wrote:
> > > >
> > > > > > +1 (non-binding)
> > > > > >
> > > > > > BR,
> > > > > > Zike Yang
> > > > > >
> > > > > > On Tue, Feb 27, 2024 at 8:56 AM PengHui Li wrote:
> > > > > >
> > > > > > > > +1 (binding)
> > > > > > > >
> > > > > > > > Regards,
> > > > > > > > Penghui
> > > > > > > >
> > > > > > > > On Mon, Feb 26, 2024 at 5:44 PM Pengcheng Jiang wrote:
> > > > > > > >
> > > > > > > > > > Hi, community
> > > > > > > > > >
> > > > > > > > > > I'm starting the vote for PIP-339: Introducing the --log-topic Option for
> > > > > > > > > > Pulsar Sinks and Sources
> > > > > > > > > > PIP link: https://github.com/apache/pulsar/pull/22071
> > > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > > Pengcheng Jiang
> > > > > > > > > >
> > > > > >
> > > >
>


Re: [DISCUSS] PIP-338: Introducing the --log-topic Option for Pulsar Sinks and Sources

2024-02-26 Thread Pengcheng Jiang
Oh, sorry, I missed this reply before sending the vote mail.

As for authorization, I think users are responsible for granting it
themselves, just as they do when setting the log topic for functions.

It is also easy to distinguish the logs when one log topic is shared by
multiple functions, because the producer adds the `fqn` property to every log
entry it sends to Pulsar, as shown here
<https://github.com/apache/pulsar/blob/82237d3684fe506bcb6426b3b23f413422e6e4fb/pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/instance/LogAppender.java#L60-L67>
:

```java
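public void append(LogEvent logEvent) {
    // Publish the formatted log line to the log topic, tagging each message
    // with the log level, the instance, and the component's FQN.
    producer.newMessage()
            .value(logEvent.getMessage().getFormattedMessage().getBytes(StandardCharsets.UTF_8))
            .property(LOG_LEVEL, logEvent.getLevel().name())
            .property(INSTANCE, instance)
            .property(FQN, fqn)
            .sendAsync();
}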
```
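On the reading side, telling the logs apart could look roughly like the sketch below. It is only an illustration: `LogEntry` is a hypothetical stand-in for a consumed message, and the property key `"fqn"` and the sample component names are assumptions rather than values confirmed in this thread.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class LogTopicDemux {
    /** Hypothetical stand-in for a message read from the shared log topic. */
    static final class LogEntry {
        final Map<String, String> properties;
        final String text;

        LogEntry(Map<String, String> properties, String text) {
            this.properties = properties;
            this.text = text;
        }
    }

    /** Groups log lines by the value of the assumed "fqn" message property. */
    static Map<String, List<String>> groupByFqn(List<LogEntry> entries) {
        Map<String, List<String>> byComponent = new LinkedHashMap<>();
        for (LogEntry entry : entries) {
            // Entries without the property are collected under a placeholder key.
            String fqn = entry.properties.getOrDefault("fqn", "<unknown>");
            byComponent.computeIfAbsent(fqn, key -> new ArrayList<>()).add(entry.text);
        }
        return byComponent;
    }

    public static void main(String[] args) {
        List<LogEntry> entries = List.of(
                new LogEntry(Map.of("fqn", "public/default/jdbc-sink"), "sink started"),
                new LogEntry(Map.of("fqn", "public/default/kafka-source"), "slow poll"),
                new LogEntry(Map.of("fqn", "public/default/jdbc-sink"), "write failed"));
        groupByFqn(entries).forEach((fqn, lines) -> System.out.println(fqn + " -> " + lines));
    }
}
```

With a real consumer, the same grouping would presumably read the property from each received message instead of from an in-memory list.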



Regards,
Pengcheng Jiang

On Mon, Feb 26, 2024 at 09:48, PengHui Li wrote:

> Do we need to take the authorization into account?
> Users might need to take one more step to grant permission to access the
> log topic.
> It is better to mention it in the proposal.
>
> If users try to use one log topic for multiple functions,
> Is it easy for them to distinguish which function the logs are from?
>
> Regards,
> Penghui
>
> On Tue, Feb 20, 2024 at 9:03 AM Pengcheng Jiang
>  wrote:
>
> > Sorry, the PIP number should be PIP-339
> >
> > On Mon, Feb 19, 2024 at 17:30, Pengcheng Jiang wrote:
> >
> > > Dear community,
> > >
> > > I created a PIP to support `--log-topic` for Pulsar Sinks and Sources:
> > > https://github.com/apache/pulsar/pull/22071
> > >
> > > It will make Pulsar Functions and Connectors have the same way to
> > > manage their logs.
> > >
> > > Any feedback and suggestions are welcome.
> > >
> > > Sincerely
> > > Pengcheng Jiang
> > >
> >
>


[VOTE] PIP-339: Introducing the --log-topic Option for Pulsar Sinks and Sources

2024-02-26 Thread Pengcheng Jiang
Hi, community

I'm starting the vote for PIP-339: Introducing the --log-topic Option for
Pulsar Sinks and Sources
PIP link: https://github.com/apache/pulsar/pull/22071

Thanks,
Pengcheng Jiang


Re: [DISCUSS] PIP-338: Introducing the --log-topic Option for Pulsar Sinks and Sources

2024-02-19 Thread Pengcheng Jiang
Sorry, the PIP number should be PIP-339

On Mon, Feb 19, 2024 at 17:30, Pengcheng Jiang wrote:

> Dear community,
>
> I created a PIP to support `--log-topic` for Pulsar Sinks and Sources:
> https://github.com/apache/pulsar/pull/22071
>
> It will make Pulsar Functions and Connectors have the same way to manage
> their logs.
>
> Any feedback and suggestions are welcome.
>
> Sincerely
> Pengcheng Jiang
>


[DISCUSS] PIP-338: Introducing the --log-topic Option for Pulsar Sinks and Sources

2024-02-19 Thread Pengcheng Jiang
Dear community,

I created a PIP to support `--log-topic` for Pulsar Sinks and Sources:
https://github.com/apache/pulsar/pull/22071

It will give Pulsar Functions and connectors the same way to manage their
logs.

Any feedback and suggestions are welcome.

Sincerely
Pengcheng Jiang


Re: [VERIFY] Pulsar Release 3.2.0 Candidate 2

2024-01-18 Thread Pengcheng Jiang
> 1. The querystate exit with `Reason: key 'hello' doesn't exist`, which the
> old version will not exit

This is a bug: a 500 error is returned instead of a 404 error. Here is the
fix: https://github.com/apache/pulsar/pull/21921

> The output missed the `version` field

In PIP-312, I used `StateStore` instead of BookKeeper's `StorageAdminClient`
to manage state keys in `ComponentImpl`, so that functions can support
`StateStore` implementations other than the BookKeeper one. For now, however,
the `StateStore.get` method doesn't return a `version`, so there is no
`version` field in the output. We may expose the `version` field through
`StateStore.get` in the future.

Regards,
Jiang Pengcheng

On Thu, Jan 18, 2024 at 11:29, PengHui Li wrote:

> It looks like something broke the behavior of the querystate from Pulsar
> Functions.
>
> 1. The querystate exit with `Reason: key 'hello' doesn't exist`, which the
> old version will not exit
>
> ```
> lipenghui@lipenghuis-MacBook-Pro apache-pulsar-3.2.0 % bin/pulsar-admin
> functions querystate --tenant test --namespace test-namespace --name
> word_count -k hello -w
> key 'hello' doesn't exist.
>
> Reason: key 'hello' doesn't exist.
> ```
>
> 2. The output missed the `version` field
>
> ```
>   "key": "hello",
>   "numberValue": 20,
>   "version": 19
> ```
>
> ```
>   "key": "hello",
>   "stringValue": "\u\u\u\u\u\u\u\n",
>   "numberValue": 10
> ```
>
> Regards,
> Penghui
>
> On Tue, Jan 16, 2024 at 6:34 PM Zixuan Liu  wrote:
>
> > +1 (non-binding)
> >
> > - Checked the checksums and signatures
> > - Built with Temurin-17.0.6+10
> > - Run standalone
> > - Checked producer and consumer
> >
> > Thanks,
> > Zixuan
> >
> > On Tue, Jan 16, 2024 at 15:57, guo jiwei wrote:
> >
> > > This is the second release candidate for Apache Pulsar version 3.2.0.
> > >
> > > It fixes the following issues:
> > > https://github.com/apache/pulsar/milestone/36?closed=1
> > >
> > > *** Please download, test and verify on this release. This release
> > > candidate verification will stay open until Jan 15 ***
> > >
> > > Note that we are verifying upon the source (tag), binaries are provided
> > > for convenience.
> > >
> > > Source and binary files:
> > > https://dist.apache.org/repos/dist/dev/pulsar/pulsar-3.2.0-candidate-2/
> > >
> > > SHA-512 checksums:
> > >
> > > 57af4de0531baada79ff3eb5e51659ae3d2b6732840a27861636d3538d8cca2b3aeb77d3f5cecf37dc5d8a0de11a05d0775ae2d2ca9398f9747fbe662e1cdeae
> > > apache-pulsar-3.2.0-bin.tar.gz
> > >
> > > 7ddd8df261f4ffeed2e4f029aa475f523bb9505869870db681a57b7df25af01d5670ba00bc7e062ddae0b962eaed9ee828a4cd7f4744a965afd77e9ae1239d6a
> > > apache-pulsar-3.2.0-src.tar.gz
> > >
> > > Maven staging repo:
> > > https://repository.apache.org/content/repositories/orgapachepulsar-1262/
> > >
> > > The tag to verify:
> > > v3.2.0-candidate-2 (5348e1b9124052a454f66769ae3e9f54ee0a75d4)
> > > https://github.com/apache/pulsar/commits/v3.2.0-candidate-2/
> > >
> > > Pulsar's KEYS file containing PGP keys you use to sign the release:
> > > https://dist.apache.org/repos/dist/dev/pulsar/KEYS
> > >
> > > Docker images:
> > >
> > > pulsar images:
> > > https://hub.docker.com/layers/technoboy8/pulsar/3.2.0-5348e1b/images/sha256-cf02a08e588ca493d10dbe472b72f5e64e4f90b8b5ae01e91321f336776a5a4e?context=repo
> > > <https://hub.docker.com/layers/mattison/pulsar/3.1.0-candidate-1/images/sha256-0efbaad7d893cc5041a46a2d4d56432bda855ae4068a38349777d1be6e98d27d?context=explore>
> > > pulsar-all images:
> > > https://hub.docker.com/layers/technoboy8/pulsar-all/3.2.0-5348e1b/images/sha256-801e395fc26257e2beb5ea0574fe27f5a4ff956c85cbcd6b362432d6dda28b95?context=repo
> > >
> > > Please download the source package, and follow the README to build
> > > and run the Pulsar standalone service.
> > >
> > > Note that this RC doesn't require a formal vote, but we would also
> > > appreciate your feedback with +1/-1. And please provide specific
> > > comments if your feedback is not +1.
> > >
> > >
> > > Regards
> > > Jiwei Guo (Tboy)
> > >
> >
>


Re: [VOTE] PIP-312 Use StateStoreProvider to manage state in Pulsar Functions endpoints

2023-11-19 Thread Pengcheng Jiang
Typo, it should be: Close this vote with 3 +1 (binding) and 5 +1 (non-binding).

On Mon, Nov 20, 2023 at 09:13, Pengcheng Jiang wrote:

> Close this vote with 3 +1(binding) and 1 +5(non-binding).
>
> On Mon, Nov 20, 2023 at 09:07, guo jiwei wrote:
>
>> +1 binding
>>
>>
>> Regards
>> Jiwei Guo (Tboy)
>>
>>
>> On Sat, Nov 18, 2023 at 12:27 AM 太上玄元道君  wrote:
>>
>> > +1 non binding
>> >
>> > On Fri, Nov 17, 2023 at 17:56, Zili Chen wrote:
>> >
>> > > +1 binding
>> > >
>> > > Thanks for your proposal.
>> > >
>> > > On 2023/11/15 03:39:42 Pengcheng Jiang wrote:
>> > > > Hi Pulsar Community,
>> > > >
>> > > > This thread is to start a vote for PIP-312: Use StateStoreProvider to
>> > > > manage state in Pulsar Functions endpoints.
>> > > >
>> > > > I start the voting process since there are some approves for the PIP
>> > > > PR.
>> > > > PR: https://github.com/apache/pulsar/pull/21438
>> > > > Discussion thread:
>> > > > https://lists.apache.org/thread/0rz29wotonmdck76pdscwbqo19t3rbds
>> > > >
>> > > > Sincerely,
>> > > > Pengcheng Jiang
>> > > >
>> > >
>> >
>>
>


Re: [VOTE] PIP-312 Use StateStoreProvider to manage state in Pulsar Functions endpoints

2023-11-19 Thread Pengcheng Jiang
@Jiwei Guo, would you please help merge the PR?
Many thanks.

On Mon, Nov 20, 2023 at 09:13, Pengcheng Jiang wrote:

> Close this vote with 3 +1(binding) and 1 +5(non-binding).
>
> On Mon, Nov 20, 2023 at 09:07, guo jiwei wrote:
>
>> +1 binding
>>
>>
>> Regards
>> Jiwei Guo (Tboy)
>>
>>
>> On Sat, Nov 18, 2023 at 12:27 AM 太上玄元道君  wrote:
>>
>> > +1 non binding
>> >
>> > On Fri, Nov 17, 2023 at 17:56, Zili Chen wrote:
>> >
>> > > +1 binding
>> > >
>> > > Thanks for your proposal.
>> > >
>> > > On 2023/11/15 03:39:42 Pengcheng Jiang wrote:
>> > > > Hi Pulsar Community,
>> > > >
>> > > > This thread is to start a vote for PIP-312: Use StateStoreProvider to
>> > > > manage state in Pulsar Functions endpoints.
>> > > >
>> > > > I start the voting process since there are some approves for the PIP
>> > > > PR.
>> > > > PR: https://github.com/apache/pulsar/pull/21438
>> > > > Discussion thread:
>> > > > https://lists.apache.org/thread/0rz29wotonmdck76pdscwbqo19t3rbds
>> > > >
>> > > > Sincerely,
>> > > > Pengcheng Jiang
>> > > >
>> > >
>> >
>>
>


Re: [VOTE] PIP-312 Use StateStoreProvider to manage state in Pulsar Functions endpoints

2023-11-19 Thread Pengcheng Jiang
Close this vote with 3 +1(binding) and 1 +5(non-binding).

On Mon, Nov 20, 2023 at 09:07, guo jiwei wrote:

> +1 binding
>
>
> Regards
> Jiwei Guo (Tboy)
>
>
> On Sat, Nov 18, 2023 at 12:27 AM 太上玄元道君  wrote:
>
> > +1 non binding
> >
> > On Fri, Nov 17, 2023 at 17:56, Zili Chen wrote:
> >
> > > +1 binding
> > >
> > > Thanks for your proposal.
> > >
> > > On 2023/11/15 03:39:42 Pengcheng Jiang wrote:
> > > > Hi Pulsar Community,
> > > >
> > > > This thread is to start a vote for PIP-312: Use StateStoreProvider to
> > > > manage state in Pulsar Functions endpoints.
> > > >
> > > > I start the voting process since there are some approves for the PIP
> > > > PR.
> > > >
> > > > PR: https://github.com/apache/pulsar/pull/21438
> > > > Discussion thread:
> > > > https://lists.apache.org/thread/0rz29wotonmdck76pdscwbqo19t3rbds
> > > >
> > > > Sincerely,
> > > > Pengcheng Jiang
> > > >
> > >
> >
>


[VOTE] PIP-312 Use StateStoreProvider to manage state in Pulsar Functions endpoints

2023-11-14 Thread Pengcheng Jiang
Hi Pulsar Community,

This thread is to start a vote for PIP-312: Use StateStoreProvider to
manage state in Pulsar Functions endpoints.

I am starting the voting process since the PIP PR has received some approvals.

PR: https://github.com/apache/pulsar/pull/21438
Discussion thread:
https://lists.apache.org/thread/0rz29wotonmdck76pdscwbqo19t3rbds

Sincerely,
Pengcheng Jiang


[DISCUSS] PIP-312 Use StateStoreProvider to manage state in Pulsar Functions endpoints

2023-10-24 Thread Pengcheng Jiang
Dear community,

I created a PIP to use `StateStoreProvider` instead of Apache BookKeeper's
`StorageAdminClient` in Pulsar Functions' putstate and querystate endpoints
to manage functions' state store:
https://github.com/apache/pulsar/pull/21438

It will decouple the function's state store from BookKeeper and enable it
to use other state store backends.
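
To illustrate the kind of decoupling meant here, below is a minimal sketch. The names `SimpleStateStore`, `InMemoryStateStore` and `StateEndpoint` are hypothetical and are not Pulsar's actual interfaces; the point is only that the endpoint logic depends on an interface, so the backend behind it can be swapped.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

class StateStoreSketch {
    /** Hypothetical minimal state-store abstraction (not Pulsar's real API). */
    interface SimpleStateStore {
        void put(String key, byte[] value);

        Optional<byte[]> get(String key);
    }

    /** In-memory backend standing in for any non-BookKeeper implementation. */
    static final class InMemoryStateStore implements SimpleStateStore {
        private final Map<String, byte[]> data = new ConcurrentHashMap<>();

        @Override
        public void put(String key, byte[] value) {
            data.put(key, value.clone());
        }

        @Override
        public Optional<byte[]> get(String key) {
            return Optional.ofNullable(data.get(key)).map(bytes -> bytes.clone());
        }
    }

    /** Endpoint logic that only knows the interface, never the backend. */
    static final class StateEndpoint {
        private final SimpleStateStore store;

        StateEndpoint(SimpleStateStore store) {
            this.store = store;
        }

        void putState(String key, String value) {
            store.put(key, value.getBytes(StandardCharsets.UTF_8));
        }

        String queryState(String key) {
            return store.get(key)
                    .map(bytes -> new String(bytes, StandardCharsets.UTF_8))
                    .orElseThrow(() -> new NoSuchElementException("key '" + key + "' doesn't exist"));
        }
    }

    public static void main(String[] args) {
        StateEndpoint endpoint = new StateEndpoint(new InMemoryStateStore());
        endpoint.putState("hello", "20");
        System.out.println(endpoint.queryState("hello"));
    }
}
```

Swapping the backend then means passing a different `SimpleStateStore` to the endpoint, with no change to the put/query logic.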

Any feedback and suggestions are welcome.

Sincerely
Pengcheng Jiang


[VOTE] PIP-272

2023-06-05 Thread Pengcheng Jiang
Hello, community:

This thread is to start a vote for PIP-272: Add stateStorageConfig to
WorkerConfig.

Discussion thread:
https://lists.apache.org/thread/pwfv7nj64frfnbw7jfydzx8my15b3lj6
PR: https://github.com/apache/pulsar/pull/20455

Sincerely
Pengcheng Jiang


Re: [DISCUSS] PIP-272 Add a `StateStoreConfig` to the `WorkerConfig`

2023-06-01 Thread Pengcheng Jiang
The PR has been created: https://github.com/apache/pulsar/pull/20455

On Thu, Jun 1, 2023 at 15:49, Pengcheng Jiang wrote:

> Sure, I will create a PR and update the issue when development is done
>
> On Wed, May 31, 2023 at 16:39, Asaf Mesika wrote:
>
>> Pengcheng, would you be willing to be the inaugural PIP in our PIP
>> submission process?
>> Yesterday, we officially moved from the GitHub issue to a markdown file
>> for PIP submissions.
>>
>> For you, it basically means moving your proposal to a markdown file and
>> submitting a PR (and deleting the content in the github issue, just
>> placing a link. Next time no need to open github issue)
>>
>> The process is described step by step here:
>> https://github.com/apache/pulsar/blob/master/pip/README.md
>>
>> Thanks!
>>
>> Asaf
>>
>>
>> On Wed, May 31, 2023 at 12:55 AM Neng Lu  wrote:
>>
>> > thanks for the improvements, +1
>> >
>> > On Tue, May 30, 2023 at 2:20 AM Pengcheng Jiang
>> >  wrote:
>> >
>> > > Hi Mesika:
>> > >
>> > > Thanks for the suggestions, I updated the pip, and for the rest
>> > > questions:
>> > >
>> > > 5. yes, all config goes through arguments instead of a file
>> > > 6. it should be a JSON string that can be deserialized to a
>> > > `Map<String, Object>`, updated in pip
>> > > 7. it should be `pulsar-admin functions localrun` command, updated in
>> > > pip
>> > > 8. the `stateStorageServiceUrl` won't be touched
>> > >
>> > > Sincerely
>> > > Pengcheng Jiang
>> > >
>> > > On Mon, May 29, 2023 at 19:53, Asaf Mesika wrote:
>> > >
>> > > > Hi Pengcheng,
>> > > >
>> > > > Looks like a solid improvement, definitely helping people using
>> > > > their own state store.
>> > > >
>> > > > I have a few comments:
>> > > >
>> > > > 1. Background knowledge should explain what is a state storage
>> > > > 2. Move problem description from Background Knowledge to Motivation.
>> > > >
>> > > > I'm quoting the template to understand what should be included in
>> > > > the Background knowledge section:
>> > > >
>> > > > 
>> > > >
>> > > > 3. `WorkerConfig` - explain briefly what is Worker and how it
>> > > > differs from Broker. Should be in background knowledge section.
>> > > >
>> > > > 4. Background knowledge should explain briefly what is a runtime and
>> > > > runtime factory.
>> > > >
>> > > > 5.
>> > > >
>> > > > > Add a new cli argument to JavaInstanceStarter and LocalRunner so
>> > > > > process runtime can pass state related config to them
>> > > >
>> > > >
>> > > > Today all config goes through arguments and not a file?
>> > > >
>> > > > 6. `--stateStorageConfig`
>> > > >   What format is the expected value?
>> > > >
>> > > > 7. `functions local run`
>> > > >  What is this?
>> > > >
>> > > > 8. Are you keeping `stateStorageServiceUrl`? Maybe people rely on
>> > > > it?
>> > > >
>> > > > 9. Don't forget to include link to discussion thread using Apache
>> > > > Pony Mail
>> > > >
>> > > > On Mon, May 29, 2023 at 10:44 AM Rui Fu  wrote:
>> > > >
>> > > > > Hi Pengcheng,
>> > > > >
>> > > > > Thanks for bringing this up, the PIP lgtm, +1.
>> > > > >
>> > > > > Best,
>> > > > >
>> > > > > Rui Fu
>> > > > > On May 29, 2023 at 13:52 +0800, Enrico Olivelli
>> > > > > <eolive...@gmail.com> wrote:
>> > > > > > Looks good
>> > > > > > +1
>> > > > > >
>> > > > > > Enrico
>> > > > > >
>> > > > > > On Mon, May 29, 2023 at 04:47, Pengcheng Jiang wrote:
>> > > > > >
>> > > > > > > Dear Pulsar community,
>> > > > > > >
>> > > > > > > I created a pip to make pulsar functions' `StateStoreProvider`
>> > > > > > > configurable with custom configurations:
>> > > > > > > https://github.com/apache/pulsar/issues/20419
>> > > > > > >
>> > > > > > > Any feedback and suggestions are welcome
>> > > > > > >
>> > > > > > > Sincerely
>> > > > > > > Pengcheng Jiang
>> > > > > > >
>> > > > >
>> > > >
>> > >
>> >
>>
>


Re: [DISCUSS] PIP-272 Add a `StateStoreConfig` to the `WorkerConfig`

2023-06-01 Thread Pengcheng Jiang
Sure, I will create a PR and update the issue when development is done

On Wed, May 31, 2023 at 16:39, Asaf Mesika wrote:

> Pengcheng, would you be willing to be the inaugural PIP in our PIP
> submission process?
> Yesterday, we officially moved from the GitHub issue to a markdown file for
> PIP submissions.
>
> For you, it basically means moving your proposal to a markdown file and
> submitting a PR (and deleting the content in the github issue, just placing
> a link. Next time no need to open github issue)
>
> The process is described step by step here:
> https://github.com/apache/pulsar/blob/master/pip/README.md
>
> Thanks!
>
> Asaf
>
>
> On Wed, May 31, 2023 at 12:55 AM Neng Lu  wrote:
>
> > thanks for the improvements, +1
> >
> > On Tue, May 30, 2023 at 2:20 AM Pengcheng Jiang
> >  wrote:
> >
> > > Hi Mesika:
> > >
> > > > Thanks for the suggestions, I updated the pip, and for the rest
> > > > questions:
> > > >
> > > > 5. yes, all config goes through arguments instead of a file
> > > > 6. it should be a JSON string that can be deserialized to a
> > > > `Map<String, Object>`, updated in pip
> > > > 7. it should be `pulsar-admin functions localrun` command, updated in
> > > > pip
> > >
> > > Sincerely
> > > Pengcheng Jiang
> > >
> > > > On Mon, May 29, 2023 at 19:53, Asaf Mesika wrote:
> > >
> > > > Hi Pengcheng,
> > > >
> > > > > Looks like a solid improvement, definitely helping people using their
> > > > > own state store.
> > > >
> > > > I have a few comments:
> > > >
> > > > 1. Background knowledge should explain what is a state storage
> > > > 2. Move problem description from Background Knowledge to Motivation.
> > > >
> > > > I'm quoting the template to understand what should be included in
> > > > the Background knowledge section:
> > > >
> > > > 
> > > >
> > > > > 3. `WorkerConfig` - explain briefly what is Worker and how it differs
> > > > > from Broker. Should be in background knowledge section.
> > > >
> > > > 4. Background knowledge should explain briefly what is a runtime and
> > > > runtime factory.
> > > >
> > > > 5.
> > > >
> > > > > > Add a new cli argument to JavaInstanceStarter and LocalRunner so
> > > > > > process runtime can pass state related config to them
> > > >
> > > >
> > > > Today all config goes through arguments and not a file?
> > > >
> > > > 6. `--stateStorageConfig`
> > > >   What format is the expected value?
> > > >
> > > > 7. `functions local run`
> > > >  What is this?
> > > >
> > > > 8. Are you keeping `stateStorageServiceUrl`? Maybe people rely on it?
> > > >
> > > > > 9. Don't forget to include link to discussion thread using Apache
> > > > > Pony Mail
> > > >
> > > >
> > > > On Mon, May 29, 2023 at 10:44 AM Rui Fu  wrote:
> > > >
> > > > > Hi Pengcheng,
> > > > >
> > > > > Thanks for bringing this up, the PIP lgtm, +1.
> > > > >
> > > > > Best,
> > > > >
> > > > > Rui Fu
> > > > > > On May 29, 2023 at 13:52 +0800, Enrico Olivelli
> > > > > > <eolive...@gmail.com> wrote:
> > > > > > > Looks good
> > > > > > > +1
> > > > > > >
> > > > > > > Enrico
> > > > > > >
> > > > > > > On Mon, May 29, 2023 at 04:47, Pengcheng Jiang wrote:
> > > > > > >
> > > > > > > > Dear Pulsar community,
> > > > > > > >
> > > > > > > > I created a pip to make pulsar functions' `StateStoreProvider`
> > > > > > > > configurable with custom configurations:
> > > > > > > > https://github.com/apache/pulsar/issues/20419
> > > > > > >
> > > > > > > Any feedback and suggestions are welcome
> > > > > > >
> > > > > > > Sincerely
> > > > > > > Pengcheng Jiang
> > > > > > >
> > > > >
> > > >
> > >
> >
>


Re: [DISCUSS] PIP-272 Add a `StateStoreConfig` to the `WorkerConfig`

2023-05-30 Thread Pengcheng Jiang
Hi Mesika:

Thanks for the suggestions; I have updated the PIP. As for the remaining questions:

5. Yes, all config goes through arguments instead of a file.
6. It should be a JSON string that can be deserialized to a `Map<String, Object>`; updated in the PIP.
7. It should be the `pulsar-admin functions localrun` command; updated in the PIP.
8. The `stateStorageServiceUrl` won't be touched.

Sincerely
Pengcheng Jiang

On Mon, May 29, 2023 at 19:53, Asaf Mesika wrote:

> Hi Pengcheng,
>
> Looks like a solid improvement, definitely helping people using their own
> state store.
>
> I have a few comments:
>
> 1. Background knowledge should explain what is a state storage
> 2. Move problem description from Background Knowledge to Motivation.
>
> I'm quoting the template to understand what should be included in
> the Background knowledge section:
>
> 
>
> 3. `WorkerConfig` - explain briefly what is Worker and how it differs from
> Broker. Should be in background knowledge section.
>
> 4. Background knowledge should explain briefly what is a runtime and
> runtime factory.
>
> 5.
>
> > Add a new cli argument to JavaInstanceStarter and LocalRunner so
> > process runtime can pass state related config to them
>
>
> Today all config goes through arguments and not a file?
>
> 6. `--stateStorageConfig`
>   What format is the expected value?
>
> 7. `functions local run`
>  What is this?
>
> 8. Are you keeping `stateStorageServiceUrl`? Maybe people rely on it?
>
> 9. Don't forget to include link to discussion thread using Apache Pony Mail
>
>
> On Mon, May 29, 2023 at 10:44 AM Rui Fu  wrote:
>
> > Hi Pengcheng,
> >
> > Thanks for bringing this up, the PIP lgtm, +1.
> >
> > Best,
> >
> > Rui Fu
> > On May 29, 2023 at 13:52 +0800, Enrico Olivelli wrote:
> > > Looks good
> > > +1
> > >
> > > Enrico
> > >
> > > On Mon, May 29, 2023 at 04:47, Pengcheng Jiang wrote:
> > >
> > > > Dear Pulsar community,
> > > >
> > > > I created a pip to make pulsar functions' `StateStoreProvider`
> > > > configurable with custom configurations:
> > > > https://github.com/apache/pulsar/issues/20419
> > > >
> > > > Any feedback and suggestions are welcome
> > > >
> > > > Sincerely
> > > > Pengcheng Jiang
> > > >
> >
>


[DISCUSS] PIP-272 Add a `StateStoreConfig` to the `WorkerConfig`

2023-05-28 Thread Pengcheng Jiang
Dear Pulsar community,

I created a PIP to make the Pulsar Functions `StateStoreProvider` configurable
with custom settings: https://github.com/apache/pulsar/issues/20419

Any feedback and suggestions are welcome.

Sincerely
Pengcheng Jiang


Re: [DISCUSS] Improve Pulsar Function Source Primitive Schema Mapping

2023-04-27 Thread Pengcheng Jiang
Hello Neng,

IMO, we should update code [2] to follow the doc. Existing functions that are
already running won't touch code [2]. On a new run, functions will fail to
start, which will remind users to update their functions.
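
As a rough sketch of what "following the doc" could mean, the lookup below maps Java classes to primitive schema type names. It is not the actual `TopicSchema` code: the class, the method, and the string type names are hypothetical stand-ins, the list of primitives is limited to the common ones, and falling back to JSON for everything else is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

class PrimitiveSchemaMapping {
    // Assumed mapping of Java classes to primitive schema type names,
    // modelled on the primitive-type table in the schema documentation [1].
    private static final Map<Class<?>, String> PRIMITIVE_TYPES = new HashMap<>();

    static {
        PRIMITIVE_TYPES.put(Boolean.class, "BOOLEAN");
        PRIMITIVE_TYPES.put(Byte.class, "INT8");
        PRIMITIVE_TYPES.put(Short.class, "INT16");
        PRIMITIVE_TYPES.put(Integer.class, "INT32");
        PRIMITIVE_TYPES.put(Long.class, "INT64");
        PRIMITIVE_TYPES.put(Float.class, "FLOAT");
        PRIMITIVE_TYPES.put(Double.class, "DOUBLE");
        PRIMITIVE_TYPES.put(byte[].class, "BYTES");
        PRIMITIVE_TYPES.put(String.class, "STRING");
    }

    /** Returns the primitive schema type name, or "JSON" for any other class. */
    static String schemaTypeFor(Class<?> type) {
        return PRIMITIVE_TYPES.getOrDefault(type, "JSON");
    }

    public static void main(String[] args) {
        System.out.println(schemaTypeFor(Integer.class));
        System.out.println(schemaTypeFor(byte[].class));
        System.out.println(schemaTypeFor(StringBuilder.class));
    }
}
```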

Regards,
Pengcheng Jiang

On Fri, Apr 28, 2023 at 06:59, Neng Lu wrote:

> Hi All,
>
> Based on [1], Pulsar has various primitive schema types and has a very
> clear mapping between java classes to primitive schema types.
>
> But in code [2], Pulsar Function Source only handles the byte and String
> java classes primitive schema mapping while default all other primitive
> types to JSON schema. Also for byte class types, the NONE schema is used
> instead of the BYTES schema.
>
> All these differences cause confusion for users trying to use Pulsar
> Functions for the first time, and also make Pulsar Function not following
> the Pulsar Schema official document.
>
> Ideally, we should change the code [2], to make it following [1]. But such
> changes may lead to breaking behaviors for existing users who adapted their
> code to run the Pulsar Functions.
>
> I would like to hear your thoughts on this and see how we should proceed.
>
> Thank you! Regards
>
> [1] https://pulsar.apache.org/docs/2.11.x/schema-understand/#primitive-type
> [2] https://github.com/apache/pulsar/blob/master/pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/source/TopicSchema.java#L124
>


Re: [DISCUSS] Separate the Python Functions from the installation of the Python client

2023-04-19 Thread Pengcheng Jiang
Hi Yunze,

+1 for separating.

As for the migration process:
1. Since the `pulsar-functions` lib doesn't have any code, just a list of
dependencies, I think there is no need to set up a "functions" lib?
2. We should update the docs on which dependencies are required for Python
functions.
3. The Dockerfile of the Pulsar image should be updated to install the
required Python dependencies so it can start Python functions in process
runtime mode.
4. The Python runner image's Dockerfile should be updated to install the
required Python dependencies so it can run in k8s runtime mode.

Regards,
Pengcheng Jiang


On Thu, Apr 20, 2023 at 10:46, Yunze Xu wrote:

> Hi Neng,
>
> I think currently a simple solution is to document which dependencies
> should users install to use the Python Functions.
>
> Before:
>
> ```
> # For version < 3.2.0
> pip install pulsar-client[functions]
> ```
>
> Now:
>
> ```
> # For version >= 3.2.0
> pip install pulsar-client
> pip install grpcio
> # Other dependencies...
> ```
>
> In future, we can provide a separate PyPI package like `pulsar-functions`.
>
> BTW, currently the functions extra dependencies cannot be installed
> for Python 3.10 and 3.11. I have tested the following images with the
> `pip install 'pulsar-client[functions]==3.1.0'` command.
> - python:3.11.2-bullseye
> - python:3.10.11-bullseye
>
> The reason is the version incompatibility of grpcio.
>
> > ERROR: Failed building wheel for grpcio
>
> Though I only tested 3.1.0, since the dependencies never changed, they
> should also fail for older Python client releases like 2.10.0.
>
> Thanks,
> Yunze
>
> On Thu, Apr 20, 2023 at 12:26 AM Neng Lu  wrote:
> >
> > Hi Yunze,
> >
> > +1 for separating Python client and Python Pulsar Functions pip
> > installation.
> > On the Java side, the client lib and functions lib are also published
> > separately.
> >
> > My concern is how the migration progress should look like,
> > 1. we need to set up functions lib so that users can install it using
> > `pip install pulsar-functions`
> > 2. the current `pip install pulsar-client[functions]` should prompt user
> > to use the new way
> > 3. all docs need to be updated
> > 4. for historical versions, what can we do?
> >
> >
> >
> > On 2023/04/19 15:23:49 Yunze Xu wrote:
> > > Hi all,
> > >
> > > The Python client has been separated since PIP-209 [1] and now the
> > > Python client is maintained in a separated repository [2]. However,
> > > the Python Function is still maintained in the main repo [3].
> > >
> > > Currently, we can install the Python client with the following ways:
> > > 1. pip install pulsar-client
> > > 2. pip install pulsar-client[avro]
> > > 3. pip install pulsar-client[functions]
> > > 4. pip install pulsar-client[all]
> > >
> > > See [4] for the difference. However, for the 3rd and 4th ways, it
> > > installs all the dependencies required by the Python Functions.
> > > However, they are broken for the recent releases because of the
> > > outdated dependencies [5]. However, these dependencies are from the
> > > Python Functions [3], not the Python client library itself. Also,
> > > there are no tests in the Python client repo [2] for these functions
> > > so these dependencies cannot be verified.
> > >
> > > IMO, these dependencies should be maintained in the directory of the
> > > Python Functions. We should not rely on the Python client to install
> > > the dependencies for the Python Functions.
> > >
> > > Therefore, my suggestion is to drop the 3rd and 4th installation
> > > methods in future releases of the Python client. After that, we should
> > > update the scripts in the main repo to install the Python Functions in
> > > [3].
> > >
> > > I'm not familiar with the Pulsar Functions, so feel free to show your
> > > suggestions if any of you have any concerns.
> > >
> > > [1] https://github.com/apache/pulsar/pull/17881
> > > [2] http://github.com/apache/pulsar-client-python
> > > [3] https://github.com/apache/pulsar/tree/master/pulsar-functions/instance/src/main/python
> > > [4] https://pulsar.apache.org/docs/2.11.x/client-libraries-python/
> > > [5] https://github.com/apache/pulsar-client-python/blob/a6476d9c45508f55a7af4b25001038a8e3a27489/setup.py#L80-L88
> > >
> > > Thanks,
> > > Yunze
> > >
>


-- 

<https://streamnative.io/>

Pengcheng Jiang

Software Engineer

e: pengcheng.ji...@streamnative.io

p: 13540631948

streamnative.io

<http://github.com/streamnative>
<https://www.linkedin.com/company/streamnative/>
<https://twitter.com/streamnativeio/>


Re: [DISCUSS] Support custom compressionType for pulsar functions

2023-02-28 Thread Pengcheng Jiang
Using `LZ4` as the zero value seems better; I will update.
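
A small sketch of why that ordering keeps backward compatibility: in proto3, an enum field that was never set reads back as the value numbered 0. The Java enum below is only a hypothetical mirror of the proto enum with the ordering suggested in the quoted reply, and `decode` is an illustrative helper, not generated protobuf code.

```java
class ZeroValueOrdering {
    /** Hypothetical mirror of the proto enum with LZ4 placed in slot 0. */
    enum CompressionType { LZ4, NONE, ZLIB, ZSTD, SNAPPY }

    /** Maps an in-range enum number to its value, as a reader of the field would. */
    static CompressionType decode(int wireNumber) {
        return CompressionType.values()[wireNumber];
    }

    public static void main(String[] args) {
        // A spec written before the field existed carries no value, i.e. number 0.
        System.out.println(decode(0));
    }
}
```

With this ordering, specs stored before the field existed decode to `LZ4` without any special-case handling.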

On Wed, Mar 1, 2023 at 11:12, Rui Fu wrote:

> +1, very useful, just one question: why not set the `LZ4` as the “zero”
> value instead? Like for the enum with following orders: `LZ4`, `NONE`,
> `ZLIB`, `ZSTD` and `SNAPPY`? So it will remain the backward compatibility.
>
> Best,
>
> Rui Fu
> On Mar 1, 2023 at 10:18 +0800, Neng Lu wrote:
> > +1 for the change.
> >
> > On 2023/02/28 01:06:51 Pengcheng Jiang wrote:
> > > Hello, community:
> > >
> > > ### Motivation
> > >
> > > Currently, pulsar functions are using `LZ4` as the compression type,
> > > and users cannot change it, yet some users may want to custom this
> > > behavior.
> > >
> > > ### Modifications
> > >
> > > Add a `CompressionType` field(which is an enum) to the `ProducerSpec`
> > > in the `Function.proto`; this enum has six values: `NOTSET`, `NONE`,
> > > `LZ4`, `ZLIB`, `ZSTD` and `SNAPPY`, there is a `NOTSET` value besides of 5
> > > supported compression type, so that even users don't set the
> > > `CompressionType`, it will fallback to its "zero" value: `NOTSET`
> > > instead of `NONE`, and in such case, pulsar function instances will use
> > > `LZ4` to keep the same behavior with before.
> > >
> > > PTAL when you have time and feel free to leave any comments.
> > >
> > > Best Regards,
> > > Pengcheng Jiang
> > >
> > > [0] https://github.com/apache/pulsar/pull/19470
>




[DISCUSS] Support custom compressionType for pulsar functions

2023-02-27 Thread Pengcheng Jiang
Hello, community:

### Motivation

Currently, Pulsar Functions always use `LZ4` as the compression type and
users cannot change it, yet some users may want to customize this behavior.

### Modifications

Add a `CompressionType` field (an enum) to the `ProducerSpec` in
`Function.proto`. The enum has six values: `NOTSET`, `NONE`, `LZ4`, `ZLIB`,
`ZSTD` and `SNAPPY`. `NOTSET` is added alongside the five supported
compression types so that, when users don't set `CompressionType`, the field
falls back to its "zero" value `NOTSET` rather than `NONE`; in that case,
Pulsar Function instances use `LZ4`, keeping the previous behavior.
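
The fallback described above can be sketched as follows. The Java enum is a hypothetical mirror of the proposed proto enum, and `effective` is an illustrative helper, not the code in the PR.

```java
class CompressionFallback {
    /** Hypothetical mirror of the proposed proto enum; NOTSET is the zero value. */
    enum CompressionType { NOTSET, NONE, LZ4, ZLIB, ZSTD, SNAPPY }

    /** Resolves the compression type an instance would actually use. */
    static CompressionType effective(CompressionType configured) {
        // An unset field reads as NOTSET, which keeps the old LZ4 behavior.
        if (configured == null || configured == CompressionType.NOTSET) {
            return CompressionType.LZ4;
        }
        return configured;
    }

    public static void main(String[] args) {
        System.out.println(effective(CompressionType.NOTSET));
        System.out.println(effective(CompressionType.NONE));
        System.out.println(effective(CompressionType.ZSTD));
    }
}
```

The separate `NOTSET` value is what lets an explicit `NONE` (no compression) be told apart from "the user did not configure anything".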

PTAL when you have time and feel free to leave any comments.

Best Regards,
Pengcheng Jiang

[0] https://github.com/apache/pulsar/pull/19470