Hello!

For long-lived, low-intensity streaming, the data streamer cannot make use of its client-side per-partition batching; it degenerates into a thin wrapper over the ordinary cache update operations that are already available through the Cache API.
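Ilya's recommendation further down the thread, acquiring a fresh streamer per burst rather than keeping one open, can be sketched as below. This is a minimal sketch, not the poster's actual code: the cache name "records" and the key/value types are illustrative, and it assumes a running Ignite node.

```java
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteDataStreamer;

import java.util.Map;

public class BatchLoader {
    /** Loads one burst of records, acquiring a fresh streamer per batch. */
    public static void loadBatch(Ignite ignite, Map<Long, String> batch) {
        // try-with-resources closes the streamer; close() flushes any
        // remaining buffered entries before returning.
        try (IgniteDataStreamer<Long, String> streamer =
                 ignite.dataStreamer("records")) {
            streamer.allowOverwrite(true);     // also update existing keys
            streamer.perNodeBufferSize(1024);  // tune one streamer instead of
                                               // running several of them
            batch.forEach(streamer::addData);
        }
    }
}
```

If per-record acknowledgment matters (see Denis's point about resending unacknowledged records), note that `addData` returns an `IgniteFuture` that can be inspected to detect failed writes.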
Regards,
--
Ilya Kasnacheev

On Tue, Feb 4, 2020 at 03:41, Denis Magda <dma...@apache.org> wrote:

> Ilya,
>
> I don't quite understand why the data streamer is not suitable as a
> long-running solution. Please don't mislead people; list out the specific
> limitations instead. I don't see anything wrong with keeping an open data
> streamer that transfers data to Ignite in real time.
>
> Narges, if the streamer crashes, your service/app needs to resend the
> records that were not acknowledged. You might be able to use Kafka
> Connect here, since it keeps track of committed/pending records.
>
> -
> Denis
>
>
> On Mon, Feb 3, 2020 at 6:13 AM Ilya Kasnacheev <ilya.kasnach...@gmail.com>
> wrote:
>
>> Hello!
>>
>> I think these benefits are imaginary. You will have to worry more about
>> the service than about a data streamer, which can be recreated at any
>> time.
>>
>> Regards,
>> --
>> Ilya Kasnacheev
>>
>>
>> On Mon, Feb 3, 2020 at 16:58, narges saleh <snarges...@gmail.com> wrote:
>>
>>> Thanks Ilya.
>>> I have to listen to these bursts of data, which arrive every few
>>> seconds, meaning an almost constant stream of bursts from different
>>> data sources.
>>> The main reason the service grid is appealing to me is its resiliency;
>>> I don't have to worry about it. With a client-side streamer, I will
>>> have to deploy it myself, keep it up and running, and load-balance it.
>>>
>>> On Mon, Feb 3, 2020 at 7:17 AM Ilya Kasnacheev <
>>> ilya.kasnach...@gmail.com> wrote:
>>>
>>>> Hello!
>>>>
>>>> I don't see why you would deploy it as a service; it sounds like you
>>>> will have to send more data over the network. If you have to pull
>>>> batches in, then a service should work. I recommend re-acquiring the
>>>> data streamer for each batch.
>>>>
>>>> Please note that the data streamer is very scalable, so it is
>>>> preferable to tune one streamer rather than try to use more than one.
>>>>
>>>> Regards,
>>>> --
>>>> Ilya Kasnacheev
>>>>
>>>>
>>>> On Mon, Feb 3, 2020
at 16:11, narges saleh <snarges...@gmail.com> wrote:
>>>>
>>>>> Hi Ilya,
>>>>> The data comes in huge batches of records (each burst can be up to
>>>>> 50-100 MB, which I plan to spread across multiple streamers), so the
>>>>> streamer seems to be the way to go. Also, I don't want to establish
>>>>> a JDBC connection each time.
>>>>> So, if the streamer is the way to go, is it feasible to deploy it as
>>>>> a service?
>>>>> Thanks.
>>>>>
>>>>> On Mon, Feb 3, 2020 at 6:51 AM Ilya Kasnacheev <
>>>>> ilya.kasnach...@gmail.com> wrote:
>>>>>
>>>>>> Hello!
>>>>>>
>>>>>> Contrary to its name, the data streamer is not actually suitable
>>>>>> for long-lived, low-intensity streaming. What it is good for is
>>>>>> burst loading of a large amount of data in a short period of time.
>>>>>>
>>>>>> If your data arrives in large batches, you can use a data streamer
>>>>>> for each batch. If not, you are better off using the Cache API.
>>>>>>
>>>>>> If you are worried that the plain Cache API is slow, but you also
>>>>>> want failure resilience, there is a catch-22: the only way to make
>>>>>> something resilient is to put it into a cache :)
>>>>>>
>>>>>> Regards,
>>>>>> --
>>>>>> Ilya Kasnacheev
>>>>>>
>>>>>>
>>>>>> On Mon, Feb 3, 2020 at 14:34, narges saleh <snarges...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>> But services are by definition long-lived, right? Here is my
>>>>>>> layout: the data is continuously generated and sent to the
>>>>>>> streamer services (via a JDBC connection with the SET STREAMING ON
>>>>>>> option), deployed, say, as a node singleton (actually also
>>>>>>> deployed as microservices), to load the data into the caches. The
>>>>>>> streamers flush data based on timers.
>>>>>>> If a streamer crashes before its buffer is flushed, the client
>>>>>>> catches the exception and resends the batch. Any issue with this
>>>>>>> layout?
>>>>>>>
>>>>>>> Thanks.
>>>>>>>
>>>>>>> On Mon, Feb 3, 2020 at 5:02 AM Ilya Kasnacheev <
>>>>>>> ilya.kasnach...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hello!
>>>>>>>>
>>>>>>>> It is not recommended to keep long-lived data streamers; it is
>>>>>>>> best to acquire one when it is needed.
>>>>>>>>
>>>>>>>> If you have to keep a data streamer around, don't forget to
>>>>>>>> flush() it. That way you don't have to worry about its queue.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> --
>>>>>>>> Ilya Kasnacheev
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Feb 3, 2020 at 13:24, narges saleh <snarges...@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>> My specific question/concern is about the state of the streamer
>>>>>>>>> when it runs as a service, i.e. when it crashes and gets
>>>>>>>>> redeployed. Specifically, what happens to the data?
>>>>>>>>> I have a similar question about the state of a continuous query
>>>>>>>>> when it is deployed as a service: what happens to the data in
>>>>>>>>> the listener's queue?
>>>>>>>>>
>>>>>>>>> Thanks.
>>>>>>>>>
>>>>>>>>> On Sun, Feb 2, 2020 at 4:18 PM Mikael <mikael-arons...@telia.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi!
>>>>>>>>>>
>>>>>>>>>> Not as far as I know; I have a number of services using
>>>>>>>>>> streamers without any problems. Do you have any specific
>>>>>>>>>> problem with it?
>>>>>>>>>>
>>>>>>>>>> Mikael
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 2020-02-02 at 22:33, narges saleh wrote:
>>>>>>>>>> > Hi All,
>>>>>>>>>> >
>>>>>>>>>> > Is there a problem with running the data streamer as a
>>>>>>>>>> > service, instantiated in the init method? Or with loading the
>>>>>>>>>> > data via a JDBC connection with streaming mode enabled?
>>>>>>>>>> > In either case, the deployment is affinity-based.
>>>>>>>>>> >
>>>>>>>>>> > thanks.
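The service-based layout discussed in the thread (a streamer held open inside a node-singleton service, flushed on a timer) could look roughly like the sketch below. This is an assumption-laden illustration, not anyone's actual code: the cache name "records", the key/value types, and the in-process ingest queue are all hypothetical. The key point from the thread survives in the design: only a bounded window of unflushed entries can be lost on a crash, and the producer must resend anything not yet acknowledged.

```java
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteDataStreamer;
import org.apache.ignite.resources.IgniteInstanceResource;
import org.apache.ignite.services.Service;
import org.apache.ignite.services.ServiceContext;

import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class StreamerService implements Service {
    @IgniteInstanceResource
    private transient Ignite ignite;

    // Hypothetical ingest queue; in practice it would be fed by whatever
    // transport delivers the bursts to this node.
    private transient BlockingQueue<Map.Entry<Long, String>> queue;

    private transient IgniteDataStreamer<Long, String> streamer;

    @Override public void init(ServiceContext ctx) {
        queue = new LinkedBlockingQueue<>();
        streamer = ignite.dataStreamer("records");
        // Flush buffered entries at least every 500 ms, so a crash loses
        // at most ~500 ms of unacknowledged data (which the producer
        // must be prepared to resend anyway).
        streamer.autoFlushFrequency(500);
    }

    @Override public void execute(ServiceContext ctx) throws Exception {
        while (!ctx.isCancelled()) {
            // Poll with a timeout so cancellation is noticed promptly.
            Map.Entry<Long, String> e = queue.poll(200, TimeUnit.MILLISECONDS);
            if (e != null)
                streamer.addData(e.getKey(), e.getValue());
        }
    }

    @Override public void cancel(ServiceContext ctx) {
        if (streamer != null)
            streamer.close();   // flushes remaining buffered entries
    }
}
```

Note the trade-off both Ilyas' and Denis's messages circle around: the service grid restarts the service for you, but it does not preserve the streamer's in-memory buffer across a crash, so resilience for the data itself still has to come from the sender's resend logic (or a tracking layer such as Kafka Connect, as suggested above).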