Could you please set 2 configuration options:
- state.backend.rocksdb.predefined-options = SPINNING_DISK_OPTIMIZED_HIGH_MEM
- state.backend.rocksdb.memory.partitioned-index-filters = true
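In flink-conf.yaml that would look like the following (a minimal sketch; the state.backend line is only there for completeness, assuming you also select RocksDB in the config):

state.backend: rocksdb
state.backend.rocksdb.predefined-options: SPINNING_DISK_OPTIMIZED_HIGH_MEM
state.backend.rocksdb.memory.partitioned-index-filters: true

Note that partitioned-index-filters only has an effect when RocksDB memory is bounded, i.e. with state.backend.rocksdb.memory.managed (the default) or state.backend.rocksdb.memory.fixed-per-slot, because the partitioned index/filter blocks are kept in the shared block cache.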
Regards,
Maciek

On Sat, 10 Jul 2021 at 08:54, Adrian Bednarz <adrianbedn...@gmail.com> wrote:
>
> I didn't tweak any RocksDB knobs. The only thing we did was to increase
> managed memory to 12 GB, which was supposed to help RocksDB according to
> the documentation. The rest stays at the defaults. Incremental
> checkpointing was enabled as well, but disabling it made no difference in
> performance.
>
> On Fri, 9 Jul 2021 at 20:43, Maciej Bryński <mac...@brynski.pl> wrote:
>>
>> Hi Adrian,
>> Could you share your state backend configuration?
>>
>> Regards,
>> Maciek
>>
>> On Fri, 9 Jul 2021 at 19:09, Adrian Bednarz <adrianbedn...@gmail.com> wrote:
>> >
>> > Hello,
>> >
>> > We are experimenting with lookup joins in Flink 1.13.0. Unfortunately,
>> > we unexpectedly hit a significant performance degradation when changing
>> > the state backend to RocksDB.
>> >
>> > We performed tests with two tables: a fact table TXN and a dimension
>> > table CUSTOMER, with the following schemas:
>> >
>> > TXN:
>> >  |-- PROD_ID: BIGINT
>> >  |-- CUST_ID: BIGINT
>> >  |-- TYPE: BIGINT
>> >  |-- AMOUNT: BIGINT
>> >  |-- ITEMS: BIGINT
>> >  |-- TS: TIMESTAMP(3) **rowtime**
>> >  |-- WATERMARK FOR TS: TIMESTAMP(3) AS `TS` - INTERVAL '0' SECONDS
>> >
>> > CUSTOMER:
>> >  |-- ID: BIGINT
>> >  |-- STATE: BIGINT
>> >  |-- AGE: BIGINT
>> >  |-- SCORE: DOUBLE
>> >  |-- PRIMARY KEY: ID
>> >
>> > And the following query:
>> >
>> > select state, sum(amount) from txn t
>> >   JOIN customer FOR SYSTEM_TIME AS OF t.ts ON t.cust_id = customer.id
>> >   group by state, TUMBLE(t.ts, INTERVAL '1' SECOND)
>> >
>> > In our catalog, we reconfigured the customer table so that the
>> > watermark is set to infinity on that side of the join. We generate data
>> > in a round-robin fashion (except for the timestamp, which grows in
>> > steps of 1 ms).
>> >
>> > We performed our experiments on a single c5.4xlarge machine with heap
>> > and managed memory sizes set to 12 GB, writing to a blackhole sink.
>> > With 2 000 000 fact records and 100 000 dimension records, the job with
>> > the heap backend finishes in 5 seconds, whereas with RocksDB it runs
>> > for 1 h 24 m. With 400 000 dimension records the runtime doesn't grow
>> > much, going up to 1 h 36 m (the job processes more records, after all).
>> >
>> > We also checked what would happen if we reduced the number of customer
>> > ids to 1. Our expectation was that RocksDB would no longer offload
>> > anything to disk, so performance should be comparable with the heap
>> > backend. That run took 10 minutes.
>> >
>> > Is this something anybody has experienced, or is it to be expected? Of
>> > course, we expected RocksDB to be slower, but 300 events per second is
>> > below our expectations.
>> >
>> > Thanks,
>> > Adrian
>>
>>
>>
>> --
>> Maciek Bryński

--
Maciek Bryński
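PS. In case anyone wants to reproduce the numbers from the quoted thread, here is a minimal Flink SQL sketch of the setup. It is an approximation, not the original job: the datagen and upsert-kafka connectors, the generator rate, and the extra ts column on customer (needed to make it a versioned table for the event-time temporal join) are all assumptions; the original setup used a custom catalog with the customer-side watermark pushed to infinity.

-- Fact stream; 'datagen' is a stand-in for the real generator.
CREATE TABLE txn (
  prod_id BIGINT,
  cust_id BIGINT,
  `type`  BIGINT,
  amount  BIGINT,
  items   BIGINT,
  ts      TIMESTAMP(3),
  WATERMARK FOR ts AS ts - INTERVAL '0' SECOND
) WITH (
  'connector' = 'datagen',
  'rows-per-second' = '10000'
);

-- Dimension table as a versioned table; 'upsert-kafka' is a stand-in
-- and all connector properties are placeholders.
CREATE TABLE customer (
  id    BIGINT,
  state BIGINT,
  age   BIGINT,
  score DOUBLE,
  ts    TIMESTAMP(3),
  PRIMARY KEY (id) NOT ENFORCED,
  WATERMARK FOR ts AS ts
) WITH (
  'connector' = 'upsert-kafka',
  'topic' = 'customer',
  'properties.bootstrap.servers' = '...',
  'key.format' = 'json',
  'value.format' = 'json'
);

-- The query from the thread: event-time temporal join plus a 1 s tumbling window.
SELECT state, SUM(amount)
FROM txn t
JOIN customer FOR SYSTEM_TIME AS OF t.ts
  ON t.cust_id = customer.id
GROUP BY state, TUMBLE(t.ts, INTERVAL '1' SECOND);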