Hi Stephan,

I am using Flink 1.2.0 and running the job on YARN, on c3.4xlarge EC2
instances with 16 cores and 30GB RAM each.

In total I have set 32 slots and allotted 1200 network buffers.
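For reference, the relevant flink-conf.yaml entries look roughly like this
(the per-TM slot split below is just an example; only the totals above are
exact):

  # flink-conf.yaml (Flink 1.2)
  taskmanager.numberOfTaskSlots: 8          # e.g. 8 slots x 4 TMs = 32 total
  taskmanager.network.numberOfBuffers: 1200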

I have attached the latest checkpointing snapshot, its configuration, CPU
load average, physical memory usage, and heap memory usage here:
<http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/file/n11759/Checkpoints_State.png>
<http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/file/n11759/Checkpoint_Configuration.png>
<http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/file/n11759/CPU_Load_Physical_memory.png>
<http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/file/n11759/Heap_Usage.png>

Before I describe the topology, I want to mention that with object reuse
enabled, 32M records from each of the two Kafka sources (64M in total) were
processed in 17 minutes, and I did not see much of a halt in between. This is
the best result I have got so far (earlier it took 1 hour 3 minutes), but I
don't understand how it worked so well :) How does object reuse help here? I
have also used the FLASH_SSD_OPTIMIZED option.
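In case it is relevant, this is roughly how I enable both in Flink 1.2
(sketch only; the checkpoint path is a placeholder, and note that the
RocksDBStateBackend constructor declares an IOException):

  import org.apache.flink.contrib.streaming.state.PredefinedOptions;
  import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;
  import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

  StreamExecutionEnvironment env =
      StreamExecutionEnvironment.getExecutionEnvironment();

  // reuse objects between chained operators instead of copying each record
  env.getConfig().enableObjectReuse();

  // RocksDB state backend with options pre-tuned for SSDs
  RocksDBStateBackend backend =
      new RocksDBStateBackend("s3://my-bucket/checkpoints"); // placeholder path
  backend.setPredefinedOptions(PredefinedOptions.FLASH_SSD_OPTIMIZED);
  env.setStateBackend(backend);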

The program uses the following operations (a rough skeleton is sketched after
the list):
1. Consume data from two Kafka sources
2. Extract the information from each record (flatMap)
3. Write the as-is data to S3 (sink)
4. Union both streams and apply a tumbling window on it to perform an outer
join (keyBy->window->apply)
5. Some other downstream operators to enrich the data (map->flatMap->map)
6. Write the enriched data to S3 (sink)
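The skeleton looks roughly like this (class names, topic names, key field,
window size and the sink helpers are placeholders; the real join and
enrichment logic is omitted):

  DataStream<Record> s1 = env
      .addSource(new FlinkKafkaConsumer09<>("topic-1", schema, kafkaProps))
      .flatMap(new ExtractFields());                  // step 2
  DataStream<Record> s2 = env
      .addSource(new FlinkKafkaConsumer09<>("topic-2", schema, kafkaProps))
      .flatMap(new ExtractFields());

  s1.addSink(rawS3Sink());                            // step 3: as-is to S3
  s2.addSink(rawS3Sink());

  s1.union(s2)                                        // step 4: union + window
    .keyBy("joinKey")                                 // POJO field as key
    .timeWindow(Time.seconds(30))                     // tumbling window
    .apply(new OuterJoinWindowFunction())
    .map(new EnrichStepOne())                         // step 5: enrichment
    .flatMap(new EnrichStepTwo())
    .map(new EnrichStepThree())
    .addSink(enrichedS3Sink());                       // step 6: enriched to S3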

I have allocated 8GB of heap space to each TM (see the fourth snapshot above).
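The TMs come from a YARN session started roughly like this (the container
count is an example; -tm is the container size in MB, so the actual JVM heap
ends up somewhat lower after the YARN memory cutoff):

  ./bin/yarn-session.sh -n 4 -s 8 -jm 2048 -tm 8192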

The final aim is to test with at least 100M records.

Let me know your thoughts.

Regards,
Vinay Patil




