Yingchun Lai created KUDU-3400:
----------------------------------

             Summary: CompilationManager::RequestRowProjector consumed too much memory
                 Key: KUDU-3400
                 URL: https://issues.apache.org/jira/browse/KUDU-3400
             Project: Kudu
          Issue Type: Bug
          Components: codegen
    Affects Versions: 1.12.0
            Reporter: Yingchun Lai

In one of our clusters, we found that the CompilationManager::RequestRowProjector function unexpectedly consumed too much memory. Some background on this cluster:
# Some tables have more than 1000 columns, so the table schema can be very costly to copy.
# The tservers are sometimes under memory pressure, and then flush more frequently (to try to reduce the memory consumed by MRS/DMS).

I captured a heap profile on a tserver and found that CompilationManager::RequestRowProjector accounted for most of the memory, allocated while copying Schema objects. The source code:

{code:java}
CompilationTask(const Schema& base, const Schema& proj, CodeCache* cache,
                CodeGenerator* generator)
    : base_(base),
      proj_(proj),
      cache_(cache),
      generator_(generator) {}
{code}

That is to say, the Schemas (i.e. base and proj) are copied whenever a CompilationTask object is constructed.

The heap profile shows that Schema consumed about 50GB of memory, which really shocked me. Even though the Schema is large, how could it consume 50GB of memory?

I forgot to run `pstack` on the process when it happened. Maybe there were hundreds of thousands of CompilationManager::RequestRowProjector calls at that time, but according to the code logic, they should not hang there for a long time?

{code:java}
if (!cached) {
  shared_ptr<CompilationTask> task(make_shared<CompilationTask>(
      *base_schema, *projection, &cache_, &generator_));
  WARN_NOT_OK_EVERY_N_SECS(pool_->Submit([task]() { task->Run(); }),
                           "RowProjector compilation request submit failed", 10);
  return false;
}
{code}
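To illustrate why by-value Schema members could add up when many tasks are queued, here is a minimal, self-contained sketch. It does not use Kudu's real classes: FakeSchema, CopyingTask, SharingTask, the global counter, and both helper functions are hypothetical names invented for this example. It only counts deep copies under two ownership styles, and the shared_ptr variant is one possible direction under these assumptions, not a confirmed fix.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Counts deep copies of FakeSchema (illustration only, not thread-safe).
std::size_t g_schema_copies = 0;

// Hypothetical stand-in for a wide table schema; NOT Kudu's Schema class.
struct FakeSchema {
  explicit FakeSchema(std::size_t num_columns) {
    for (std::size_t i = 0; i < num_columns; ++i) {
      column_names.push_back("col_" + std::to_string(i));
    }
  }
  // Deep copy: duplicates every column name and bumps the counter.
  FakeSchema(const FakeSchema& other) : column_names(other.column_names) {
    ++g_schema_copies;
  }
  std::vector<std::string> column_names;
};

// Same ownership pattern as the constructor quoted in the report:
// each task stores its own full copy of both schemas.
struct CopyingTask {
  CopyingTask(const FakeSchema& base, const FakeSchema& proj)
      : base_(base), proj_(proj) {}
  FakeSchema base_;
  FakeSchema proj_;
};

// Hypothetical alternative: tasks share immutable schemas by pointer.
struct SharingTask {
  SharingTask(std::shared_ptr<const FakeSchema> base,
              std::shared_ptr<const FakeSchema> proj)
      : base_(std::move(base)), proj_(std::move(proj)) {
    assert(base_ && proj_);
  }
  std::shared_ptr<const FakeSchema> base_;
  std::shared_ptr<const FakeSchema> proj_;
};

// Queues num_tasks by-value tasks and returns how many deep copies were made.
std::size_t CopiesWhenQueueingByValue(std::size_t num_tasks,
                                      std::size_t num_columns) {
  g_schema_copies = 0;
  FakeSchema base(num_columns);
  FakeSchema proj(num_columns);
  std::vector<std::shared_ptr<CopyingTask>> queue;
  for (std::size_t i = 0; i < num_tasks; ++i) {
    queue.push_back(std::make_shared<CopyingTask>(base, proj));
  }
  return g_schema_copies;
}

// Queues num_tasks pointer-sharing tasks and returns the deep-copy count.
std::size_t CopiesWhenQueueingShared(std::size_t num_tasks,
                                     std::size_t num_columns) {
  g_schema_copies = 0;
  std::shared_ptr<const FakeSchema> base =
      std::make_shared<FakeSchema>(num_columns);
  std::shared_ptr<const FakeSchema> proj =
      std::make_shared<FakeSchema>(num_columns);
  std::vector<std::shared_ptr<SharingTask>> queue;
  for (std::size_t i = 0; i < num_tasks; ++i) {
    queue.push_back(std::make_shared<SharingTask>(base, proj));
  }
  return g_schema_copies;
}
```

In this sketch, the by-value variant makes two deep copies per pending task, so the memory held grows with the number of tasks waiting in the queue. The shared variant makes none. Whether the real tserver had that many tasks pending at once is exactly the open question in this report.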