Hi Gyula,

I have a couple of operator on the pipeline. Filter, mapper, flatMap, and
each of these operator contains some cache data.

So i think that means for every other operator on the pipeline, i will need
to add a new stream to update each cache data.


Cheers

On Thu, Aug 20, 2015 at 5:33 PM, Gyula Fóra <gyula.f...@gmail.com> wrote:

> Hi,
>
> I don't think I fully understand your question, could you please try to be
> a little more specific?
>
> I assume by caching you mean that you keep the current model as an
> operator state. Why would you need to add new streams in this case?
>
> I might be slow to answer as I am currently on vacation without stable
> internet connection.
>
> Cheers,
> Gyula
>
> On Thu, Aug 20, 2015 at 5:36 AM Welly Tambunan <if05...@gmail.com> wrote:
>
>> Hi Gyula,
>>
>> I have another question. So if i cache something on the operator, to keep
>> it up to date,  i will always need to add and connect another stream of
>> changes to the operator ?
>>
>> Is this right for every case ?
>>
>> Cheers
>>
>> On Wed, Aug 19, 2015 at 3:21 PM, Welly Tambunan <if05...@gmail.com>
>> wrote:
>>
>>> Hi Gyula,
>>>
>>> That's really helpful. The docs is improving so much since the last time
>>> (0.9).
>>>
>>> Thanks a lot !
>>>
>>> Cheers
>>>
>>> On Wed, Aug 19, 2015 at 3:07 PM, Gyula Fóra <gyula.f...@gmail.com>
>>> wrote:
>>>
>>>> Hey,
>>>>
>>>> If it is always better to check the events against a more up-to-date
>>>> model (even if the events we are checking arrived before the update) then
>>>> it is fine to keep the model outside of the system.
>>>>
>>>> In this case we need to make sure that we can push the updates to the
>>>> external system consistently. If you are using the PersistentKafkaSource
>>>> for instance it can happen that some messages are replayed in case of
>>>> failure. In this case you need to make sure that you remove duplicate
>>>> updates or have idempotent updates.
>>>>
>>>> You can read about the checkpoint mechanism in the Flink website:
>>>> https://ci.apache.org/projects/flink/flink-docs-master/internals/stream_checkpointing.html
>>>>
>>>> Cheers,
>>>> Gyula
>>>>
>>>> On Wed, Aug 19, 2015 at 9:56 AM Welly Tambunan <if05...@gmail.com>
>>>> wrote:
>>>>
>>>>> Thanks Gyula,
>>>>>
>>>>> Another question i have..
>>>>>
>>>>> > ... while external model updates would be *tricky *to keep
>>>>> consistent.
>>>>> Is that still the case if the Operator treat the external model as
>>>>> read-only ? We create another stream that will update the external model
>>>>> separately.
>>>>>
>>>>> Could you please elaborate more about this one ?
>>>>>
>>>>> Cheers
>>>>>
>>>>> On Wed, Aug 19, 2015 at 2:52 PM, Gyula Fóra <gyula.f...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> In that case I would apply a map to wrap in some common type, like a
>>>>>> n Either<t1,t2> before the union.
>>>>>>
>>>>>> And then in the coflatmap you can unwrap it.
>>>>>> On Wed, Aug 19, 2015 at 9:50 AM Welly Tambunan <if05...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi Gyula,
>>>>>>>
>>>>>>> Thanks.
>>>>>>>
>>>>>>> However update1 and update2 have a different type. Based on my
>>>>>>> understanding, i don't think we can use union. How can we handle this 
>>>>>>> one ?
>>>>>>>
>>>>>>> We like to create our event strongly type to get the domain language
>>>>>>> captured.
>>>>>>>
>>>>>>>
>>>>>>> Cheers
>>>>>>>
>>>>>>> On Wed, Aug 19, 2015 at 2:37 PM, Gyula Fóra <gyula.f...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hey,
>>>>>>>>
>>>>>>>> One input of your co-flatmap would be model updates and the other
>>>>>>>> input would be events to check against the model if I understand 
>>>>>>>> correctly.
>>>>>>>>
>>>>>>>> This means that if your model updates come from more than one
>>>>>>>> stream you need to union them into a single stream before connecting 
>>>>>>>> them
>>>>>>>> with the event stream and applying the coatmap.
>>>>>>>>
>>>>>>>> DataStream updates1 = ....
>>>>>>>> DataStream updates2 = ....
>>>>>>>> DataStream events = ....
>>>>>>>>
>>>>>>>> events.connect(updates1.union(updates2).broadcast()).flatMap(...)
>>>>>>>>
>>>>>>>> Does this answer your question?
>>>>>>>>
>>>>>>>> Gyula
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wednesday, August 19, 2015, Welly Tambunan <if05...@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi Gyula,
>>>>>>>>>
>>>>>>>>> Thanks for your response.
>>>>>>>>>
>>>>>>>>> However the model can received multiple event for update. How can
>>>>>>>>> we do that with co-flatmap as i can see the connect API only received
>>>>>>>>> single datastream ?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> > ... while external model updates would be tricky to keep
>>>>>>>>> consistent.
>>>>>>>>> Is that still the case if the Operator treat the external model as
>>>>>>>>> read-only ? We create another stream that will update the external 
>>>>>>>>> model
>>>>>>>>> separately.
>>>>>>>>>
>>>>>>>>> Cheers
>>>>>>>>>
>>>>>>>>> On Wed, Aug 19, 2015 at 2:05 PM, Gyula Fóra <gyf...@apache.org>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hey!
>>>>>>>>>>
>>>>>>>>>> I think it is safe to say that the best approach in this case is
>>>>>>>>>> creating a co-flatmap that will receive updates on one input. The 
>>>>>>>>>> events
>>>>>>>>>> should probably be broadcasted in this case so you can check in 
>>>>>>>>>> parallel.
>>>>>>>>>>
>>>>>>>>>> This approach can be used effectively with Flink's checkpoint
>>>>>>>>>> mechanism, while external model updates would be tricky to keep 
>>>>>>>>>> consistent.
>>>>>>>>>>
>>>>>>>>>> Cheers,
>>>>>>>>>> Gyula
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Wed, Aug 19, 2015 at 8:44 AM Welly Tambunan <if05...@gmail.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi All,
>>>>>>>>>>>
>>>>>>>>>>> We have a streaming computation that required to validate the
>>>>>>>>>>> data stream against the model provided by the user.
>>>>>>>>>>>
>>>>>>>>>>> Right now what I have done is to load the model into flink
>>>>>>>>>>> operator and then validate against it. However the model can be 
>>>>>>>>>>> updated and
>>>>>>>>>>> changed frequently. Fortunately we always publish this event to 
>>>>>>>>>>> RabbitMQ.
>>>>>>>>>>>
>>>>>>>>>>> I think we can
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>    1. Create RabbitMq listener for model changed event from
>>>>>>>>>>>    inside the operator, then update the model if event arrived.
>>>>>>>>>>>
>>>>>>>>>>>    But i think this will create race condition if not handle
>>>>>>>>>>>    correctly and it seems odd to keep this
>>>>>>>>>>>
>>>>>>>>>>>    2. We can move the model into external in external memory
>>>>>>>>>>>    cache storage and keep the model up to date using flink. So the 
>>>>>>>>>>> operator
>>>>>>>>>>>    will retrieve that from memory cache
>>>>>>>>>>>
>>>>>>>>>>>    3. Create two stream and using co operator for managing the
>>>>>>>>>>>    shared state.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> What is your suggestion on keeping the state up to date from
>>>>>>>>>>> external event ? Is there some kind of best practice for 
>>>>>>>>>>> maintaining model
>>>>>>>>>>> up to date on streaming operator ?
>>>>>>>>>>>
>>>>>>>>>>> Thanks a lot
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Cheers
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Welly Tambunan
>>>>>>>>>>> Triplelands
>>>>>>>>>>>
>>>>>>>>>>> http://weltam.wordpress.com
>>>>>>>>>>> http://www.triplelands.com <http://www.triplelands.com/blog/>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Welly Tambunan
>>>>>>>>> Triplelands
>>>>>>>>>
>>>>>>>>> http://weltam.wordpress.com
>>>>>>>>> http://www.triplelands.com <http://www.triplelands.com/blog/>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Welly Tambunan
>>>>>>> Triplelands
>>>>>>>
>>>>>>> http://weltam.wordpress.com
>>>>>>> http://www.triplelands.com <http://www.triplelands.com/blog/>
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Welly Tambunan
>>>>> Triplelands
>>>>>
>>>>> http://weltam.wordpress.com
>>>>> http://www.triplelands.com <http://www.triplelands.com/blog/>
>>>>>
>>>>
>>>
>>>
>>> --
>>> Welly Tambunan
>>> Triplelands
>>>
>>> http://weltam.wordpress.com
>>> http://www.triplelands.com <http://www.triplelands.com/blog/>
>>>
>>
>>
>>
>> --
>> Welly Tambunan
>> Triplelands
>>
>> http://weltam.wordpress.com
>> http://www.triplelands.com <http://www.triplelands.com/blog/>
>>
>


-- 
Welly Tambunan
Triplelands

http://weltam.wordpress.com
http://www.triplelands.com <http://www.triplelands.com/blog/>

Reply via email to