Hi Seth,

Thanks for your comment. I'd seen that repository before, and it was
really helpful in validating that this was the right approach.
I don't think it addresses my question, though: how could one add
dynamic behavior to your TransactionManager? In this case, state that
should be available to all TransactionManager instances when they first
receive a message of type Transaction.

Seth Wiesman <sjwies...@gmail.com> wrote on Tuesday, 23/02/2021 at
16:02:

> Hey Miguel,
>
> What you are describing is exactly what is implemented in this repo. The
> TransactionManager function acts as an orchestrator to work with the other
> functions. The repo is structured as an exercise but the full solution
> exists on the branch `advanced-solution`.
>
> https://github.com/ververica/flink-statefun-workshop
>
> On Tue, Feb 23, 2021 at 8:34 AM Miguel Araújo <upwarr...@gmail.com> wrote:
>
>> Another possibility I am considering is handling this in Flink using a
>> broadcast and adding all the information needed to the event itself. I'm a
>> little concerned about the amount of data that will be serialized and sent
>> on every request though, as I'll need to include information about all
>> available remote functions, for instance.
>>
>> Miguel Araújo <upwarr...@gmail.com> wrote on Tuesday, 23/02/2021 at
>> 09:14:
>>
>>> Hi Gordon, Igal,
>>>
>>> Thanks for your replies.
>>> PubSub would be a good addition, I have a few scenarios where that would
>>> be useful.
>>>
>>> However, after reading your answers I realized that your proposed
>>> solutions (which address the most obvious interpretation of my question) do
>>> not necessarily solve my problem. I should have just stated what it was,
>>> instead of trying to propose a solution by discussing broadcast...
>>>
>>> I'm trying to implement an "orchestrator" function which, given an
>>> event, will trigger multiple remote function calls, aggregate their results
>>> and eventually call yet more functions (based on a provided dependency
>>> graph). Hence, this orchestrator function has state per event_id and each
>>> function instance is short-lived (a couple of seconds at most, ideally
>>> sub-second). The question then is not about how to modify a long-running
>>> function instance (which PubSub would enable), but rather how to have the
>>> dependency graph available to new functions.
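A minimal, StateFun-independent sketch of the per-event fan-out/aggregate bookkeeping described above (plain Java; the class and method names are illustrative, not part of any SDK):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Per-event bookkeeping for a fan-out/aggregate orchestrator.
// Each eventId tracks how many remote replies are still pending and
// accumulates the results received so far.
public class Orchestrator {
    private final Map<String, Integer> pending = new HashMap<>();
    private final Map<String, List<String>> results = new HashMap<>();

    // On the first message for an event: fan out to `fanOut` remote functions.
    public void start(String eventId, int fanOut) {
        pending.put(eventId, fanOut);
        results.put(eventId, new ArrayList<>());
    }

    // On each remote-function reply: returns the aggregated results once
    // every reply has arrived, otherwise an empty Optional.
    public Optional<List<String>> onResult(String eventId, String result) {
        results.get(eventId).add(result);
        int remaining = pending.merge(eventId, -1, Integer::sum);
        if (remaining == 0) {
            pending.remove(eventId);
            return Optional.of(results.remove(eventId));
        }
        return Optional.empty();
    }
}
```

In actual StateFun code, `pending` and `results` would live in per-address persisted state rather than shared maps, and the next wave of calls (per the dependency graph) would be triggered when `onResult` returns a complete set.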
>>>
>>> Given this, Igal's answer seems promising because we have the
>>> FunctionProvider instantiating a local variable and passing it down on
>>> every instantiation. I'm assuming there is one FunctionProvider per
>>> TaskManager. Is there an easy way to have the FunctionProvider receiving
>>> data coming from a Flink DataStream, or receiving StateFun messages?
>>> Otherwise, I could have it subscribe to a Kafka topic directly.
>>>
>>> I really appreciate your help.
>>>
>>> Miguel
>>>
>>> Igal Shilman <i...@ververica.com> wrote on Monday, 22/02/2021 at
>>> 12:09:
>>>
>>>> Hi Miguel,
>>>>
>>>> I think that there are a couple of ways to achieve this, and it really
>>>> depends on your specific use case, and the trade-offs
>>>> that you are willing to accept.
>>>>
>>>> For example, one way to approach this:
>>>> - Suppose you have an external service somewhere that returns a
>>>> representation of the logic to be interpreted by
>>>> your function at runtime (I think that is the scenario you are
>>>> describing)
>>>> - Then, you can write a background task (a thread) that periodically
>>>> queries that service, and keeps in memory the latest version.
>>>> - You can initialize this background task in your FunctionProvider
>>>> implementation, or even in your StatefulModule if you wish.
>>>> - Then, make sure that your dynamic stateful function has an access to
>>>> the latest value fetched by your client (for example via a shared reference
>>>> like a j.u.c.AtomicReference)
>>>> - Then on receive, you can simply get that reference and re-apply your
>>>> rules.
>>>>
>>>> Take a look at [1] for example (it is not exactly the same, but I
>>>> believe that it is close enough)
>>>>
>>>> [1]
>>>> https://github.com/apache/flink-statefun/blob/master/statefun-examples/statefun-async-example/src/main/java/org/apache/flink/statefun/examples/async/Module.java
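The steps Igal outlines above can be sketched without any StateFun dependency as a background refresher plus a shared AtomicReference (the external service call is stubbed out as a Callable; all names are illustrative):

```java
import java.util.concurrent.Callable;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

// Periodically fetches the latest rules and exposes them through an
// AtomicReference; the stateful function reads current() on every receive.
public class RuleRefresher implements AutoCloseable {
    private final AtomicReference<String> latestRules;
    private final ScheduledExecutorService scheduler =
        Executors.newSingleThreadScheduledExecutor();

    public RuleRefresher(Callable<String> fetchRules, long periodMillis) throws Exception {
        // Fetch once synchronously so there is always a value to read.
        this.latestRules = new AtomicReference<>(fetchRules.call());
        scheduler.scheduleAtFixedRate(() -> {
            try {
                latestRules.set(fetchRules.call());
            } catch (Exception e) {
                // Keep serving the previous version if a fetch fails.
            }
        }, periodMillis, periodMillis, TimeUnit.MILLISECONDS);
    }

    // Called from the function on receive.
    public String current() {
        return latestRules.get();
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}
```

In this scheme, the FunctionProvider would create one such refresher per TaskManager and hand the same instance to every function it builds, so all function instances see the latest fetched version.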
>>>>
>>>> Good luck,
>>>> Igal.
>>>>
>>>>
>>>> On Mon, Feb 22, 2021 at 4:32 AM Tzu-Li (Gordon) Tai <
>>>> tzuli...@apache.org> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> FWIW, there is this JIRA that is tracking a pubsub / broadcast
>>>>> messaging primitive in StateFun:
>>>>> https://issues.apache.org/jira/browse/FLINK-16319
>>>>>
>>>>> This is probably what you are looking for. And I do agree, in the case
>>>>> that the control stream (which updates the application logic) is high
>>>>> volume, redeploying functions may not work well.
>>>>>
>>>>> I don't think there really is a "recommended" way of doing the
>>>>> "broadcast control stream, join with main stream" pattern with StateFun at
>>>>> the moment, at least without FLINK-16319.
>>>>> On the other hand, it could be possible to use stateful functions to
>>>>> implement a pub-sub model in user space for the time being. I've actually
>>>>> left some ideas for implementing that in the comments of FLINK-16319.
>>>>>
>>>>> Cheers,
>>>>> Gordon
>>>>>
>>>>>
>>>>> On Mon, Feb 22, 2021 at 6:38 AM Miguel Araújo <upwarr...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi everyone,
>>>>>>
>>>>>> What is the recommended way of achieving the equivalent of a
>>>>>> broadcast in Flink when using Stateful Functions?
>>>>>>
>>>>>> For instance, assume we are implementing something similar to Flink's
>>>>>> demo fraud detection
>>>>>> <https://flink.apache.org/news/2020/03/24/demo-fraud-detection-2.html> 
>>>>>> but
>>>>>> in Stateful Functions - how can one dynamically update the application's
>>>>>> logic then?
>>>>>> There was a similar question in this mailing list in the past where
>>>>>> it was recommended moving the dynamic logic to a remote function
>>>>>> <http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Stateful-Functions-ML-model-prediction-tp38286p38302.html>
>>>>>>  so
>>>>>> that one could achieve that by deploying a new container. I think that's
>>>>>> not very realistic as updates might happen with a frequency that's not
>>>>>> compatible with that approach (e.g., sticking to the fraud detection
>>>>>> example, updating fraud detection rules every hour is not unusual), nor
>>>>>> should one be deploying a new container when data (not code) changes.
>>>>>>
>>>>>> Is there a way of, for example, modifying FunctionProviders
>>>>>> <https://ci.apache.org/projects/flink/flink-statefun-docs-master/sdk/java.html#function-providers-and-dependency-injection>
>>>>>> on the fly?
>>>>>>
>>>>>> Thanks,
>>>>>> Miguel
>>>>>>
>>>>>
