[ 
https://issues.apache.org/jira/browse/MINIFICPP-929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marton Szasz resolved MINIFICPP-929.
------------------------------------
    Resolution: Won't Fix

abandoned

> Create memory map interface to flow files in ProcessSession/ContentRepository
> -----------------------------------------------------------------------------
>
>                 Key: MINIFICPP-929
>                 URL: https://issues.apache.org/jira/browse/MINIFICPP-929
>             Project: Apache NiFi MiNiFi C++
>          Issue Type: Improvement
>            Reporter: Andrew Christianson
>            Assignee: Andrew Christianson
>            Priority: Minor
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Currently, MiNiFi - C++ only supports stream-oriented I/O to FlowFile 
> payloads. This can limit performance in cases where in-place access to the 
> payload is desirable. In cases where data can be accessed randomly and 
> in-place, a significant speedup can be realized by mapping the payload into 
> system memory address space. This is natively supported at the kernel level 
> in Linux and macOS via mmap() on files, and in Windows via the equivalent 
> file-mapping API (CreateFileMapping()/MapViewOfFile()). Other 
> repositories, such as the VolatileRepository, already store the entire 
> payload in memory, so it is natural to pass through this memory block as if 
> it were a memory-mapped file. While the DatabaseContentRepository does not 
> appear to natively support a memory map interface, accesses via an emulated 
> memory-map interface should be possible with no performance degradation with 
> respect to a full read via the streaming interface.
> Cases where in-place, random access is beneficial include, but are not 
> limited to:
>  * in-place parsing of JSON (e.g. RapidJSON supports parsing in-place, at 
> least for strings).
>  * access of payload via protocol buffers
>  * random access of large files on disk, where it would otherwise require 
> many seek() and read() syscalls
> The interface should be accessible by processors via a mmap() call on 
> ProcessSession (adjacent to read() and write()). A MemoryMapCallback should 
> be provided, which is called back via a process() call where the argument is 
> an instance of BaseMemoryMap. The BaseMemoryMap is extended for each type of 
> repository that MiNiFi - C++ supports, including: FileSystemRepository, 
> VolatileRepository, and DatabaseContentRepository.
> As part of the change, in addition to extensive unit test coverage, 
> benchmarks should be written such that the performance impact can be 
> empirically measured and evaluated.
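
For readers of the archive, the callback-style interface described in the issue could look roughly like the sketch below. This is not code from MiNiFi - C++ (the issue was closed as Won't Fix). Only the names BaseMemoryMap, MemoryMapCallback, process() and VolatileRepository come from the description; the member functions getData() and getSize(), the VolatileMemoryMap class, the free function mmapPayload() (standing in for a mmap() call on ProcessSession), and the UppercaseCallback example are all hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical shape of the proposed abstraction: a contiguous, in-place
// view of a FlowFile payload. getData()/getSize() are assumed names.
class BaseMemoryMap {
 public:
  virtual ~BaseMemoryMap() = default;
  virtual uint8_t* getData() = 0;
  virtual std::size_t getSize() const = 0;
};

// Assumed variant for an in-memory repository: the payload already lives in
// RAM, so the "map" simply passes the existing buffer through.
class VolatileMemoryMap : public BaseMemoryMap {
 public:
  explicit VolatileMemoryMap(std::vector<uint8_t>& payload) : payload_(payload) {}
  uint8_t* getData() override { return payload_.data(); }
  std::size_t getSize() const override { return payload_.size(); }

 private:
  std::vector<uint8_t>& payload_;
};

// Callback invoked with the mapped payload, as the issue proposes.
class MemoryMapCallback {
 public:
  virtual ~MemoryMapCallback() = default;
  virtual int64_t process(BaseMemoryMap& map) = 0;
};

// Hypothetical stand-in for a mmap() call on ProcessSession: build the map
// for the payload and hand it to the callback.
inline int64_t mmapPayload(std::vector<uint8_t>& payload, MemoryMapCallback& callback) {
  VolatileMemoryMap map(payload);
  return callback.process(map);
}

// Example callback: uppercases ASCII letters in place, without copying the
// payload through a stream. Returns the number of bytes visited.
class UppercaseCallback : public MemoryMapCallback {
 public:
  int64_t process(BaseMemoryMap& map) override {
    uint8_t* data = map.getData();
    for (std::size_t i = 0; i < map.getSize(); ++i) {
      if (data[i] >= 'a' && data[i] <= 'z') {
        data[i] = static_cast<uint8_t>(data[i] - 'a' + 'A');
      }
    }
    return static_cast<int64_t>(map.getSize());
  }
};
```

A file-backed variant would presumably wrap the platform mapping call behind the same two accessors, so processors would not need to know which repository holds the payload.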



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
