Could you please set 2 configuration options:
- state.backend.rocksdb.predefined-options = SPINNING_DISK_OPTIMIZED_HIGH_MEM
- state.backend.rocksdb.memory.partitioned-index-filters = true
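In flink-conf.yaml that would look like the following (a minimal sketch; the state.backend line is only there for completeness, assuming you also select RocksDB in the config):

state.backend: rocksdb
state.backend.rocksdb.predefined-options: SPINNING_DISK_OPTIMIZED_HIGH_MEM
state.backend.rocksdb.memory.partitioned-index-filters: true

Note that partitioned-index-filters only has an effect when RocksDB memory is bounded, i.e. with state.backend.rocksdb.memory.managed (the default) or state.backend.rocksdb.memory.fixed-per-slot, because the partitioned index/filter blocks are kept in the shared block cache.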
Regards,
Maciek

On Sat, 10 Jul 2021 at 08:54, Adrian Bednarz <adrianbedn...@gmail.com> wrote:
>
> I didn't tweak any RocksDB knobs. The only thing we did was to increase
> managed memory to 12 GB, which was supposed to help RocksDB according to
> the documentation. The rest stays at the defaults. Incremental
> checkpointing was enabled as well, but disabling it made no difference in
> performance.
>
> On Fri, 9 Jul 2021 at 20:43, Maciej Bryński <mac...@brynski.pl> wrote:
>>
>> Hi Adrian,
>> Could you share your state backend configuration?
>>
>> Regards,
>> Maciek
>>
>> On Fri, 9 Jul 2021 at 19:09, Adrian Bednarz <adrianbedn...@gmail.com> wrote:
>> >
>> > Hello,
>> >
>> > We are experimenting with lookup joins in Flink 1.13.0. Unfortunately,
>> > we unexpectedly hit a significant performance degradation when changing
>> > the state backend to RocksDB.
>> >
>> > We performed tests with two tables: a fact table TXN and a dimension
>> > table CUSTOMER, with the following schemas:
>> >
>> > TXN:
>> >  |-- PROD_ID: BIGINT
>> >  |-- CUST_ID: BIGINT
>> >  |-- TYPE: BIGINT
>> >  |-- AMOUNT: BIGINT
>> >  |-- ITEMS: BIGINT
>> >  |-- TS: TIMESTAMP(3) **rowtime**
>> >  |-- WATERMARK FOR TS: TIMESTAMP(3) AS `TS` - INTERVAL '0' SECONDS
>> >
>> > CUSTOMER:
>> >  |-- ID: BIGINT
>> >  |-- STATE: BIGINT
>> >  |-- AGE: BIGINT
>> >  |-- SCORE: DOUBLE
>> >  |-- PRIMARY KEY: ID
>> >
>> > And the following query:
>> >
>> > select state, sum(amount) from txn t
>> >   JOIN customer FOR SYSTEM_TIME AS OF t.ts ON t.cust_id = customer.id
>> >   group by state, TUMBLE(t.ts, INTERVAL '1' SECOND)
>> >
>> > In our catalog, we reconfigured the customer table so that the
>> > watermark is set to infinity on that side of the join. We generate data
>> > in a round-robin fashion (except for the timestamp, which grows in
>> > steps of 1 ms).
>> >
>> > We performed our experiments on a single c5.4xlarge machine with heap
>> > and managed memory sizes set to 12 GB, writing to a blackhole sink.
>> > With 2 000 000 fact records and 100 000 dimension records, the job with
>> > the heap backend finishes in 5 seconds, whereas with RocksDB it runs
>> > for 1 h 24 m. With 400 000 dimension records the runtime doesn't grow
>> > much, going up to 1 h 36 m (the job processes more records, after all).
>> >
>> > We also checked what would happen if we reduced the number of customer
>> > ids to 1. Our expectation was that RocksDB would no longer offload
>> > anything to disk, so performance should be comparable with the heap
>> > backend. That run took 10 minutes.
>> >
>> > Is this something anybody has experienced, or is it to be expected? Of
>> > course, we expected RocksDB to be slower, but 300 events per second is
>> > below our expectations.
>> >
>> > Thanks,
>> > Adrian
>>
>>
>>
>> --
>> Maciek Bryński

--
Maciek Bryński
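PS. In case anyone wants to reproduce the numbers from the quoted thread, here is a minimal Flink SQL sketch of the setup. It is an approximation, not the original job: the datagen and upsert-kafka connectors, the generator rate, and the extra ts column on customer (needed to make it a versioned table for the event-time temporal join) are all assumptions; the original setup used a custom catalog with the customer-side watermark pushed to infinity.

-- Fact stream; 'datagen' is a stand-in for the real generator.
CREATE TABLE txn (
  prod_id BIGINT,
  cust_id BIGINT,
  `type`  BIGINT,
  amount  BIGINT,
  items   BIGINT,
  ts      TIMESTAMP(3),
  WATERMARK FOR ts AS ts - INTERVAL '0' SECOND
) WITH (
  'connector' = 'datagen',
  'rows-per-second' = '10000'
);

-- Dimension table as a versioned table; 'upsert-kafka' is a stand-in
-- and all connector properties are placeholders.
CREATE TABLE customer (
  id    BIGINT,
  state BIGINT,
  age   BIGINT,
  score DOUBLE,
  ts    TIMESTAMP(3),
  PRIMARY KEY (id) NOT ENFORCED,
  WATERMARK FOR ts AS ts
) WITH (
  'connector' = 'upsert-kafka',
  'topic' = 'customer',
  'properties.bootstrap.servers' = '...',
  'key.format' = 'json',
  'value.format' = 'json'
);

-- The query from the thread: event-time temporal join plus a 1 s tumbling window.
SELECT state, SUM(amount)
FROM txn t
JOIN customer FOR SYSTEM_TIME AS OF t.ts
  ON t.cust_id = customer.id
GROUP BY state, TUMBLE(t.ts, INTERVAL '1' SECOND);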