[ https://issues.apache.org/jira/browse/MINIFICPP-929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marton Szasz resolved MINIFICPP-929.
------------------------------------
    Resolution: Won't Fix

abandoned

> Create memory map interface to flow files in ProcessSession/ContentRepository
> -----------------------------------------------------------------------------
>
>                 Key: MINIFICPP-929
>                 URL: https://issues.apache.org/jira/browse/MINIFICPP-929
>             Project: Apache NiFi MiNiFi C++
>          Issue Type: Improvement
>            Reporter: Andrew Christianson
>            Assignee: Andrew Christianson
>            Priority: Minor
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Currently, MiNiFi - C++ only supports stream-oriented I/O to FlowFile
> payloads. This can limit performance in cases where in-place access to the
> payload is desirable. In cases where data can be accessed randomly and
> in-place, a significant speedup can be realized by mapping the payload into
> the process's memory address space. This is natively supported at the kernel
> level in Linux, macOS, and Windows via mmap() (or the equivalent file-mapping
> API on Windows). Other repositories, such as the VolatileRepository, already
> store the entire payload in memory, so it is natural to pass through this
> memory block as if it were a memory-mapped file. While the
> DatabaseContentRepository does not appear to natively support a memory map
> interface, access via an emulated memory-map interface should be possible
> with no performance degradation relative to a full read via the streaming
> interface.
> Cases where in-place, random access is beneficial include, but are not
> limited to:
>  * in-place parsing of JSON (e.g. RapidJSON supports parsing in-place, at
> least for strings).
>  * access of payload via protocol buffers
>  * random access of large files on disk, where it would otherwise require
> many seek() and read() syscalls
> The interface should be accessible by processors via a mmap() call on
> ProcessSession (adjacent to read() and write()). A MemoryMapCallback should
> be provided, which is called back via a process() call where the argument is
> an instance of BaseMemoryMap. The BaseMemoryMap is extended for each type of
> repository that MiNiFi - C++ supports, including: FileSystemRepository,
> VolatileRepository, and DatabaseContentRepository.
> As part of the change, in addition to extensive unit test coverage,
> benchmarks should be written such that the performance impact can be
> empirically measured and evaluated.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
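
For reference, the callback-style interface described in the issue could look roughly like the sketch below. This is only an illustration under assumptions, not code from MiNiFi - C++: the names BaseMemoryMap, MemoryMapCallback and process() come from the issue text, while getData(), getSize(), VolatileMemoryMap, UppercaseCallback and mmapPayload() are hypothetical, as is the int64_t return type of process(). Only the in-memory (VolatileRepository-like) case is shown, where the existing buffer is passed through without copying.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Base type named in the issue: a contiguous, in-place view of a payload.
// The accessor names getData()/getSize() are assumptions for this sketch.
class BaseMemoryMap {
 public:
  virtual ~BaseMemoryMap() = default;
  virtual std::uint8_t* getData() = 0;
  virtual std::size_t getSize() const = 0;
};

// Hypothetical map for an in-memory repository: it hands out the buffer
// the repository already holds, so no copy of the payload is made.
class VolatileMemoryMap : public BaseMemoryMap {
 public:
  explicit VolatileMemoryMap(std::vector<std::uint8_t>& payload)
      : payload_(payload) {}
  std::uint8_t* getData() override { return payload_.data(); }
  std::size_t getSize() const override { return payload_.size(); }

 private:
  std::vector<std::uint8_t>& payload_;
};

// Callback named in the issue; the exact signature is assumed here.
class MemoryMapCallback {
 public:
  virtual ~MemoryMapCallback() = default;
  virtual std::int64_t process(BaseMemoryMap& map) = 0;
};

// Example callback: uppercases ASCII letters in place and returns how
// many bytes were changed.
class UppercaseCallback : public MemoryMapCallback {
 public:
  std::int64_t process(BaseMemoryMap& map) override {
    std::uint8_t* data = map.getData();
    assert(data != nullptr || map.getSize() == 0);
    std::int64_t changed = 0;
    for (std::size_t i = 0; i < map.getSize(); ++i) {
      if (data[i] >= 'a' && data[i] <= 'z') {
        data[i] = static_cast<std::uint8_t>(data[i] - 'a' + 'A');
        ++changed;
      }
    }
    return changed;
  }
};

// Hypothetical stand-in for a ProcessSession mmap() call: builds the map
// for the payload and invokes the callback on it.
std::int64_t mmapPayload(std::vector<std::uint8_t>& payload,
                         MemoryMapCallback& callback) {
  VolatileMemoryMap map(payload);
  return callback.process(map);
}
```

A processor would call something like mmapPayload(payload, callback) and the callback would read or modify the bytes directly, without a seek()/read() sequence. A file-backed variant would wrap the platform's file-mapping call behind the same BaseMemoryMap interface.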