Re: Checkpointing with RocksDB as statebackend

Stephan Ewen Tue, 21 Feb 2017 10:46:34 -0800

Hi!

I cannot find the screenshots you attached.
The Apache mailing lists sometimes don't support attachments. Can you link
to the screenshots some other way?


Stephan


On Mon, Feb 20, 2017 at 8:36 PM, vinay patil <[email protected]>
wrote:

> Hi Stephan,
>
> Just saw your mail while I was explaining the answer to your earlier
> questions. I have attached some more screenshots which are taken from the
> latest run today.
> Yes, I will try setting it to a higher value and check whether performance
> improves.
>
> Let me know your thoughts
>
> Regards,
> Vinay Patil
>
> On Tue, Feb 21, 2017 at 12:51 AM, Stephan Ewen [via Apache Flink User
> Mailing List archive.] <[hidden email]> wrote:
>
>> @Vinay!
>>
>> Just saw the screenshot you attached to the first mail. The checkpoint
>> that failed came after one that had an incredibly heavy alignment phase (14
>> GB).
>> I think that working that off caused the next checkpoint to fail, because
>> the workers were still working through the alignment backlog.
>>
>> I think you can for now fix this by setting the minimum pause between
>> checkpoints a bit higher (it is probably set a bit too small for the state
>> of your application).
>>
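A minimal sketch of why the minimum pause matters, assuming a simplified
scheduling model (not Flink's actual coordinator logic): at most one
checkpoint runs at a time, and the next one starts at the later of
"previous start + interval" and "previous end + minimum pause". The class
name, method names, and checkpoint durations are hypothetical; the 6 s
interval and 5 s pause are the values reported in the original message. In
Flink's Java API the setting itself is
CheckpointConfig.setMinPauseBetweenCheckpoints.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified model of checkpoint scheduling; an illustration, not Flink code. */
final class CheckpointPauseSketch {

    /**
     * Returns the start time (ms) of each checkpoint. Assumes one checkpoint
     * at a time; the next start is the later of (previous start + interval)
     * and (previous end + minimum pause).
     */
    static List<Long> startTimes(long intervalMs, long minPauseMs, long[] durationsMs) {
        List<Long> starts = new ArrayList<>();
        long start = 0L;
        for (long duration : durationsMs) {
            starts.add(start);
            long end = start + duration;
            start = Math.max(start + intervalMs, end + minPauseMs);
        }
        return starts;
    }

    /** Checkpoint-free time (ms) between the end of one checkpoint and the next start. */
    static List<Long> gaps(List<Long> starts, long[] durationsMs) {
        List<Long> gaps = new ArrayList<>();
        for (int i = 0; i + 1 < starts.size(); i++) {
            gaps.add(starts.get(i + 1) - (starts.get(i) + durationsMs[i]));
        }
        return gaps;
    }

    public static void main(String[] args) {
        // Hypothetical durations: two quick checkpoints, one slow one, one quick one.
        long[] durations = {500, 500, 60_000, 500};
        // 6 s interval and 5 s minimum pause, as in the original report.
        System.out.println(gaps(startTimes(6_000, 5_000, durations), durations));
        // Same interval with a 30 s minimum pause (an arbitrary larger value).
        System.out.println(gaps(startTimes(6_000, 30_000, durations), durations));
    }
}
```

In this model the workers get only the minimum pause to catch up after a slow
checkpoint, which is why a pause that is small relative to the backlog leaves
little room to recover before the next checkpoint begins.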
>> Also, can you describe what your sources are (Kafka / Kinesis or file
>> system)?
>>
>> BTW: We are currently working on
>>   - incremental RocksDB checkpoints
>>   - the network stack, to allow a new way of doing the alignment in the
>> future
>>
>> Both of these should help make the program more resilient in these
>> situations.
>>
>> Best,
>> Stephan
>>
>>
>>
>> On Mon, Feb 20, 2017 at 7:51 PM, Stephan Ewen <[hidden email]> wrote:
>>
>>> Hi Vinay!
>>>
>>> Can you start by giving us a bit of an environment spec?
>>>
>>>   - What Flink version are you using?
>>>   - What is your rough topology (what operations does the program use)?
>>>   - Where is the state (windows, keyBy)?
>>>   - What is the rough size of your checkpoints and where does the time
>>> go? Can you attach a screenshot from
>>> https://ci.apache.org/projects/flink/flink-docs-release-1.2/monitoring/checkpoint_monitoring.html
>>>   - What is the size of the JVM?
>>>
>>> Those things would be helpful to know...
>>>
>>> Best,
>>> Stephan
>>>
>>>
>>> On Mon, Feb 20, 2017 at 7:04 PM, vinay patil <[hidden email]> wrote:
>>>
>>>> Hi Xiaogang,
>>>>
>>>> Thank you for your inputs.
>>>>
>>>> Yes, I have already tried setting MaxBackgroundFlushes and
>>>> MaxBackgroundCompactions to higher values (tried 2, 4, and 8), but I am
>>>> still not getting the expected results.
>>>>
>>>> System.getProperty("java.io.tmpdir") points to /tmp, but I could not find
>>>> the RocksDB logs there. Can you please let me know where I can find them?
>>>>
>>>> Regards,
>>>> Vinay Patil
>>>>
>>>> On Mon, Feb 20, 2017 at 7:32 AM, xiaogang.sxg [via Apache Flink User
>>>> Mailing List archive.] <[hidden email]> wrote:
>>>>
>>>>> Hi Vinay
>>>>>
>>>>> Can you provide the LOG file from RocksDB? It helps a lot in figuring out
>>>>> the problem because it records the options and the events that happened
>>>>> during the execution. Unless configured otherwise, it should be located
>>>>> at the path returned by System.getProperty("java.io.tmpdir").
>>>>>
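A small sketch for locating the file, assuming only that RocksDB names its
info log LOG (with rotated copies named LOG.old.<timestamp>) inside the
database directory. The class and method names are hypothetical. The search
root defaults to java.io.tmpdir as described above; on a cluster the
TaskManager's temporary directories may be configured to point elsewhere, so
the root can be passed as an argument.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Walks a directory tree and collects files that look like RocksDB info logs. */
final class RocksDbLogFinder {

    /** Returns regular files under root named LOG or starting with LOG.old, sorted by path. */
    static List<Path> findLogFiles(Path root) throws IOException {
        final List<Path> found = new ArrayList<>();
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                String name = file.getFileName().toString();
                if (attrs.isRegularFile() && (name.equals("LOG") || name.startsWith("LOG.old"))) {
                    found.add(file);
                }
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult visitFileFailed(Path file, IOException exc) {
                // Skip entries that cannot be read (common under a shared /tmp).
                return FileVisitResult.CONTINUE;
            }
        });
        Collections.sort(found);
        return found;
    }

    public static void main(String[] args) throws IOException {
        // Default to the JVM temp directory; pass another root as the first argument.
        Path root = Paths.get(args.length > 0 ? args[0] : System.getProperty("java.io.tmpdir"));
        System.out.println("Searching " + root);
        for (Path log : findLogFiles(root)) {
            System.out.println(log + " (" + Files.size(log) + " bytes)");
        }
    }
}
```

Running it on the TaskManager host, with the same user as the TaskManager
process, avoids missing directories that other users cannot read.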
>>>>> Typically, a large amount of memory is consumed by RocksDB to store
>>>>> necessary indices. To avoid unlimited growth in memory consumption,
>>>>> you can put these indices into the block cache (set
>>>>> CacheIndexAndFilterBlock to true) and set the block cache size properly.
>>>>>
>>>>> You can also increase the number of background threads to improve the
>>>>> performance of flushes and compactions (via MaxBackgroundFlushes and
>>>>> MaxBackgroundCompactions).
>>>>>
>>>>> In YARN clusters, task managers will be killed if their memory
>>>>> utilization exceeds the allocation size. Currently Flink does not count
>>>>> the memory used by RocksDB in the allocation. We are working on
>>>>> fine-grained resource allocation (see FLINK-5131), which may help to
>>>>> avoid such problems.
>>>>>
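A rough back-of-the-envelope sketch of the budget described above. It
assumes that memtables can grow to about write buffer size x max write buffer
number per column family, and that indices and filters are kept inside a
bounded block cache as suggested. The class name, the number of column
families, the max write buffer number, the heap size, and the container size
are all hypothetical; only the write buffer sizes (64 MB and 512 MB) come
from the thread. Real usage also includes memory this sketch ignores.

```java
/** Rough estimate of RocksDB native memory versus a container limit. */
final class RocksDbMemoryEstimate {

    static final long MB = 1024L * 1024L;

    /** Approximate upper bound for memtables across all column families. */
    static long memtableBytes(long writeBufferBytes, int maxWriteBufferNumber, int columnFamilies) {
        return writeBufferBytes * maxWriteBufferNumber * columnFamilies;
    }

    /** Memtables plus block cache; assumes indices and filters live inside the cache. */
    static long nativeBytes(long writeBufferBytes, int maxWriteBufferNumber,
                            int columnFamilies, long blockCacheBytes) {
        return memtableBytes(writeBufferBytes, maxWriteBufferNumber, columnFamilies) + blockCacheBytes;
    }

    /** True if JVM heap plus the native estimate stays within the container size. */
    static boolean fitsInContainer(long containerBytes, long jvmHeapBytes, long nativeBytes) {
        return jvmHeapBytes + nativeBytes <= containerBytes;
    }

    public static void main(String[] args) {
        long container = 8192 * MB; // hypothetical container size
        long heap = 6144 * MB;      // hypothetical JVM heap
        long cache = 256 * MB;      // hypothetical block cache size
        for (long bufferMb : new long[] {64, 512}) {
            // 4 write buffers and 3 column families are illustrative values.
            long estimate = nativeBytes(bufferMb * MB, 4, 3, cache);
            System.out.println(bufferMb + " MB buffers: " + (estimate / MB) + " MB native, fits="
                    + fitsInContainer(container, heap, estimate));
        }
    }
}
```

Since the allocation does not account for RocksDB, leaving headroom between
the JVM heap and the container size is the part of this budget the user
controls directly.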
>>>>> I hope this information helps.
>>>>>
>>>>> Regards,
>>>>> Xiaogang
>>>>>
>>>>>
>>>>> ------------------------------------------------------------------
>>>>> From: Vinay Patil <[hidden email]>
>>>>> Sent: Friday, February 17, 2017, 21:19
>>>>> To: user <[hidden email]>
>>>>> Subject: Re: Checkpointing with RocksDB as statebackend
>>>>>
>>>>> Hi Guys,
>>>>>
>>>>> There seems to be some issue with RocksDB memory utilization.
>>>>>
>>>>> Within a few minutes of the job running, physical memory usage increases
>>>>> by 4-5 GB, and it keeps on increasing.
>>>>> I have tried different values for Max Buffer Size (30 MB, 64 MB, 128 MB,
>>>>> 512 MB) with Min Buffer to Merge set to 2, but physical memory keeps on
>>>>> increasing.
>>>>>
>>>>> According to RocksDB documentation, these are the main options on
>>>>> which flushing to storage is based.
>>>>>
>>>>> Can you please point out what I am doing wrong? I have tried different
>>>>> configuration options, but each time the Task Manager gets killed
>>>>> after some time :)
>>>>>
>>>>> Regards,
>>>>> Vinay Patil
>>>>>
>>>>> On Thu, Feb 16, 2017 at 6:02 PM, Vinay Patil <[hidden email]> wrote:
>>>>> I think it is more related to RocksDB. I am not very familiar with
>>>>> RocksDB either, but I am reading the tuning guide to understand the
>>>>> important values that can be set.
>>>>>
>>>>> Regards,
>>>>> Vinay Patil
>>>>>
>>>>> On Thu, Feb 16, 2017 at 5:48 PM, Stefan Richter [via Apache Flink User
>>>>> Mailing List archive.] <[hidden email]> wrote:
>>>>> What kind of problem are we talking about? S3 related or RocksDB
>>>>> related? I am not aware of problems with RocksDB per se. I think seeing
>>>>> logs for this would be very helpful.
>>>>>
>>>>> On 16.02.2017 at 11:56, Aljoscha Krettek <[hidden email]> wrote:
>>>>>
>>>>> [hidden email] and [hidden email], could this be
>>>>> the same problem that you recently saw when working with other people?
>>>>>
>>>>> On Wed, 15 Feb 2017 at 17:23 Vinay Patil <[hidden email]> wrote:
>>>>> Hi Guys,
>>>>>
>>>>> Can anyone please help me with this issue?
>>>>>
>>>>> Regards,
>>>>> Vinay Patil
>>>>>
>>>>> On Wed, Feb 15, 2017 at 6:17 PM, Vinay Patil <[hidden email]> wrote:
>>>>> Hi Ted,
>>>>>
>>>>> I have 3 boxes in my pipeline: the 1st and 2nd boxes contain a source and
>>>>> an S3 sink, and the 3rd box is a window operator followed by chained
>>>>> operators and an S3 sink.
>>>>>
>>>>> In the details link section I can see that the S3 sink is taking
>>>>> time for the acknowledgement, and it is not even going to the window
>>>>> operator chain.
>>>>>
>>>>> But as shown in the snapshot, checkpoint id 19 did not get any
>>>>> acknowledgement. I am not sure what is causing the issue.
>>>>>
>>>>> Regards,
>>>>> Vinay Patil
>>>>>
>>>>> On Wed, Feb 15, 2017 at 5:51 PM, Ted Yu [via Apache Flink User Mailing
>>>>> List archive.] <[hidden email]> wrote:
>>>>> What did the More Details link say?
>>>>>
>>>>> Thanks
>>>>>
>>>>> > On Feb 15, 2017, at 3:11 AM, vinay patil <[hidden email]> wrote:
>>>>> >
>>>>> > Hi,
>>>>> >
>>>>> > I have set the checkpointing interval to 6 secs and the minimum pause
>>>>> > between checkpoints to 5 secs. While testing the pipeline I observed
>>>>> > that some checkpoints take a long time: as you can see in the attached
>>>>> > snapshot, checkpoint id 19 took the longest before it failed, although
>>>>> > it had not received any acknowledgements. During these 10 minutes the
>>>>> > entire pipeline did not make any progress and no data was processed.
>>>>> > (For example: in 13 minutes 20M records were processed, and when the
>>>>> > checkpoint took time there was no progress for the next 10 minutes.)
>>>>> >
>>>>> > I have even tried setting the max checkpoint timeout to 3 min, but in
>>>>> > that case multiple checkpoints failed as well.
>>>>> >
>>>>> > I have set the RocksDB FLASH_SSD_OPTION.
>>>>> > What could be the issue?
>>>>> >
>>>>> > P.S. I am writing to 3 S3 sinks
>>>>> > checkpointing_issue.PNG
>>>>> > <http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/file/n11640/checkpointing_issue.PNG>
>>>>> >
>>>>> >
>>>>> >
>>>>> > --
>>>>> > View this message in context:
>>>>> > http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Checkpointing-with-RocksDB-as-statebackend-tp11640.html
>>>>> > Sent from the Apache Flink User Mailing List archive. mailing list
>>>>> > archive at Nabble.com.
>>>
>>
>>
>
>
>
>
>
