Hi, thanks for your reply.

I really appreciate your work in this library.

I used the /bin/time utility on Linux, and I saw the same result with 
another memory analyzer.
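
A single resident-set figure such as the one these tools report counts file-backed mmap pages together with heap allocations. The following is a minimal sketch for separating the two, assuming Linux 4.5 or later, where /proc/self/status exposes the RssAnon and RssFile fields. The helper names statusFieldKb and printMemoryBreakdown are hypothetical and are not part of Cap'n Proto.

```cpp
#include <fstream>
#include <initializer_list>
#include <iostream>
#include <sstream>
#include <string>

// Returns the value in kB of a field such as "VmRSS", "RssAnon" or "RssFile"
// from text in /proc/<pid>/status format, or -1 if the field is not found.
long statusFieldKb(std::istream& in, const std::string& field) {
  const std::string prefix = field + ":";
  std::string line;
  while (std::getline(in, line)) {
    if (line.compare(0, prefix.size(), prefix) == 0) {
      std::istringstream rest(line.substr(prefix.size()));
      long kb = -1;
      if (rest >> kb) {
        return kb;
      }
      return -1;
    }
  }
  return -1;
}

// Prints total, anonymous (heap/stack) and file-backed (mmap) resident memory
// of the current process. Assumes a Linux /proc filesystem is mounted.
void printMemoryBreakdown() {
  for (const char* field : {"VmRSS", "RssAnon", "RssFile"}) {
    std::ifstream status("/proc/self/status");
    std::cout << field << ": " << statusFieldKb(status, field) << " kB\n";
  }
}
```

Calling printMemoryBreakdown() after the read loop would show whether the resident memory is mostly RssFile (page cache that the kernel can reclaim) or RssAnon (memory the program itself allocated).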

As I mentioned, the capnp database can be very large, so my aim is to 
reduce memory usage when reading data from it. When I read only small 
portions of that database, I do not want my program to consume so much 
memory. The documentation refers to mmap as the way to achieve this. Do you 
think the approach I implemented in my code is wrong for that purpose?
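
One way to check how much of a mapping is actually resident, and to hand pages back to the kernel explicitly, can be sketched as follows. This is only an illustration under stated assumptions: it assumes Linux, it uses the mincore(2) and madvise(2) system calls, and residentPages and releasePages are hypothetical helper names that do not come from Cap'n Proto or from the code in this thread.

```cpp
#include <sys/mman.h>
#include <unistd.h>
#include <cstddef>
#include <vector>

// Counts how many pages of [addr, addr + length) are currently resident in
// memory, using mincore(2). addr must be page-aligned, which is always true
// for a pointer returned by mmap. Returns -1 if mincore fails.
long residentPages(void* addr, std::size_t length) {
  const std::size_t pageSize = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
  const std::size_t pages = (length + pageSize - 1) / pageSize;
  std::vector<unsigned char> state(pages);
  if (mincore(addr, length, state.data()) != 0) {
    return -1;
  }
  long resident = 0;
  for (unsigned char s : state) {
    resident += s & 1;  // the lowest bit is set when the page is resident
  }
  return resident;
}

// Tells the kernel that this process no longer needs the pages in the range.
// For a read-only MAP_PRIVATE file mapping, later accesses still succeed and
// simply reload the data from the file.
bool releasePages(void* addr, std::size_t length) {
  return madvise(addr, length, MADV_DONTNEED) == 0;
}
```

In a program like the one quoted below, residentPages(mappedData, fileSize) could be called after the read loop to see how many pages were really loaded, and releasePages could be called on ranges that are no longer needed.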

Thanks

On Wednesday, June 21, 2023 at 10:45:21 PM UTC+2 ken...@cloudflare.com 
wrote:

> Hi Adrian,
>
> How are you measuring memory usage, exactly?
>
> When using mmap, measuring memory usage gets a bit complicated. The kernel 
> will load pages of the file into memory when you read them, and then it is 
> free to discard those pages at any time -- because it can always load them 
> again later if needed. But the kernel will only actually discard pages if 
> it needs the memory for something else. So if you read the entire file by 
> mmap-ing it and reading every page, and nothing else needs memory, then all 
> those pages will stay resident in memory. But this isn't really the same as 
> your program allocating memory, because, again, all those pages can be 
> freed up instantly whenever memory is needed.
>
> In order to fully understand what is going on you may have to dig into 
> more detailed memory stats. If your OS is just giving you a single number 
> for memory usage, it isn't telling the full story. Usually you can find a 
> bunch of different statistics if you dig in a little more.
>
> -Kenton
>
> On Wed, Jun 21, 2023 at 9:47 AM Adrian <adriannr...@gmail.com> wrote:
>
>> Hello
>>
>> I have been running some tests with Cap'n Proto for some time. My aim 
>> is to read small chunks of a large serialized message in order to reduce 
>> the total memory consumption. For that purpose, I used memory-mapped 
>> reading and wrote a simple example to test memory usage.
>>
>> In the tests, I found that even if I read only a small field (address, 
>> which contains only the string "Address"), the total memory usage of the 
>> test program below is 512 MB on my machine (the capnp database is 
>> 2.1 GB). I am wondering what I am doing wrong. Note: I run the program 
>> only in "read" mode. I called "write" once to create the capnp database.
>>
>> If you have any thoughts, I would be very happy to hear them.
>>
>> *Proto file*
>>
>> ----------------------------------------------------------------------------------------------
>> @0xa5af5d9c9e54c04a;
>>
>> struct Person {
>>   name @0 :Text;
>>   id @1 :UInt32;
>>   email @2 :Text;
>>   address @3 :Text;
>> }
>>
>> struct AddressBook {
>>   people @0 :List(Person);
>> }
>>
>> ----------------------------------------------------------------------------------------------
>>
>> *Source code of example*
>>
>> ----------------------------------------------------------------------------------------------
>>
>> #include "test.capnp.h"
>> #include <capnp/message.h>
>> #include <capnp/serialize-packed.h>
>> #include <capnp/serialize.h>
>> #include <cstring>    // std::strcmp
>> #include <iostream>
>> #include <stdexcept>  // std::runtime_error
>> #include <string>
>> #include <fcntl.h>
>> #include <sys/mman.h>
>> #include <sys/stat.h>
>> #include <unistd.h>
>> #include <stdlib.h>
>>
>> void writeAddressBook(int fd)
>> {
>>   constexpr const size_t NodeNumber = 1024 * 8;
>>
>>   ::capnp::MallocMessageBuilder message;
>>
>>   AddressBook::Builder addressBook = message.initRoot<AddressBook>();
>>   ::capnp::List<Person>::Builder people = addressBook.initPeople(NodeNumber);
>>
>>   // Each string will be 128KB.
>>   constexpr const size_t size = 1024 * 128;
>>
>>   for (int i = 0; i < NodeNumber; i++)
>>   {
>>     Person::Builder person = people[i];
>>     person.setId(i);
>>     person.setName(std::string(size, 'A').c_str());
>>     person.setEmail(std::string(size, 'A').c_str());
>>     person.setAddress("Address");
>>   }
>>
>>   kj::VectorOutputStream output;
>>   writeMessage(output, message);
>>
>>   auto serializedData = output.getArray();
>>
>>   void *dataPtr = const_cast<void *>(static_cast<const void *>(
>>       serializedData.begin()));
>>   size_t dataSize = serializedData.size();
>>
>>   size_t totalBytesWritten = 0;
>>   while (totalBytesWritten < dataSize)
>>   {
>>     auto numberOfBytesWritten = write(
>>         fd, static_cast<const char *>(dataPtr) + totalBytesWritten,
>>         dataSize - totalBytesWritten);
>>     if (numberOfBytesWritten == -1)
>>     {
>>       throw std::runtime_error{"Error during creating capnp database"};
>>     }
>>     totalBytesWritten += numberOfBytesWritten;
>>   }
>> }
>>
>> void readAddressBook(int fd)
>> {
>>   struct stat st;
>>   if (fstat(fd, &st) == -1)
>>   {
>>     throw std::runtime_error{"Error during fstat of capnp database"};
>>   }
>>   size_t fileSize = st.st_size;
>>
>>   // Map the whole file read-only; pages are loaded lazily on first access.
>>   void *mapping = mmap(nullptr, fileSize, PROT_READ, MAP_PRIVATE, fd, 0);
>>   if (mapping == MAP_FAILED)
>>   {
>>     throw std::runtime_error{"Error during mmap of capnp database"};
>>   }
>>   char *mappedData = static_cast<char *>(mapping);
>>
>>   capnp::FlatArrayMessageReader reader(kj::ArrayPtr<const capnp::word>(
>>       reinterpret_cast<const capnp::word *>(mappedData),
>>       fileSize / sizeof(capnp::word)));
>>
>>   AddressBook::Reader addressBook = reader.getRoot<AddressBook>();
>>
>>   for (Person::Reader person : addressBook.getPeople())
>>   {
>>     person.getId();
>>   }
>>
>>   munmap(mappedData, fileSize);
>>   close(fd);
>> }
>>
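>> int main(int argc, char **argv)
>> {
>>   // argv[1] is only valid when a mode argument was passed.
>>   if (argc < 2)
>>   {
>>     std::cerr << "Expected one argument: --write or --read" << std::endl;
>>     return 1;
>>   }
>>
>>   int fd = open("./data.bin", O_RDWR);
>>   if (fd == -1)
>>   {
>>     std::cerr << "Could not open ./data.bin" << std::endl;
>>     return 1;
>>   }
>>
>>   if (!std::strcmp(argv[1], "--write"))
>>   {
>>     writeAddressBook(fd);
>>   }
>>
>>   if (!std::strcmp(argv[1], "--read"))
>>   {
>>     readAddressBook(fd);
>>   }
>>
>>   return 0;
>> }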
>>
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "Cap'n Proto" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to capnproto+...@googlegroups.com.
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/capnproto/a3192b90-a8bf-4151-84e8-0b8516d8f71bn%40googlegroups.com
>>  
>> <https://groups.google.com/d/msgid/capnproto/a3192b90-a8bf-4151-84e8-0b8516d8f71bn%40googlegroups.com?utm_medium=email&utm_source=footer>
>> .
>>
>

