FWIW, I have been working on something similar. Others will have more experience with the GC, but perhaps you might find this interesting.

For CSV files, I found that parsing is quite slow (and memory-intensive). So rather than parse the same data on every run, I found it helpful to parse it once in a batch job run from cron and write the result out in msgpack format.

I am not a GC expert, but what happens if you run GC.collect() once you are done parsing?
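
A minimal sketch of that experiment, assuming a hypothetical parseSomething() as a stand-in for the real CSV loading step. GC.collect() and GC.minimize() are the core.memory calls; everything else is illustrative, and whether resident memory actually drops will depend on the runtime and the OS.

```d
import core.memory : GC;
import std.stdio : writefln;

// Hypothetical stand-in for the CSV loading step: it creates many
// short-lived buffers and returns only a small result.
size_t parseSomething()
{
    size_t total;
    foreach (i; 0 .. 100_000)
    {
        auto tmp = new ubyte[](256); // temporary buffer, garbage after this iteration
        total += tmp.length;
    }
    return total;
}

void main()
{
    auto total = parseSomething();

    // Run a full collection so unreachable parse temporaries are freed...
    GC.collect();
    // ...then ask the GC to return unused memory to the OS where it can.
    GC.minimize();

    writefln("allocated %s bytes of temporaries while parsing", total);
}
```

Comparing the "ps" figure before and after the two GC calls should show how much of the resident memory was leftover parse garbage.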

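import std.file;                // read/write, called fully qualified below
import std.stdio : writefln;
import msgpack;                 // msgpack-d: pack, unpack

// Load the pre-parsed data written by the batch job in main() below.
auto loadGiltPrices()
{
        auto data=cast(ubyte[])std.file.read("/hist/msgpack/dmo.pack");
        return cast(immutable)data.unpack!(GiltPriceFromDMO[][string]);
}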

struct GiltPriceFromDMO
{
        string name;
        string ISIN;
        KPDateTime redemptionDate;
        KPDateTime closeDate;
        int indexLag;
        double cleanPrice;
        double dirtyPrice;
        double accrued;
        double yield;
        double modifiedDuration;
}

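// Batch job: parse the CSV once and write the result out as msgpack.
void main(string[] args)
{
        auto gilts=readCSVDMO();        // CSV parsing, not shown here
        ubyte[] data=pack(gilts);
        std.file.write("dmo.pack",data);
        writefln("* done");
        // Read the raw bytes back; unpacking is done in loadGiltPrices().
        data=cast(ubyte[])std.file.read("dmo.pack");
}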

On Thursday, 16 April 2015 at 12:17:24 UTC, Adil wrote:
I've written a simple socket-server app that loads securities (stock
market shares) data and allows clients to query it. The
app starts by loading instrument information from a CSV file into
some structs, then listens on a socket responding to queries. It
doesn't mutate the data or allocate anything substantial.

There are 2 main structs in the app. One stores security data,
and the other groups together securities. They are defined as
follows :

````
__gshared Securities securities;

struct Security
{
          string RIC;
          string TRBC;
          string[string] fields;
          double[string] doubles;

          @nogc @property pure size_t bytes()
          {
              size_t bytes;

              bytes = RIC.sizeof + RIC.length;
              bytes += TRBC.sizeof + TRBC.length;

              foreach(k,v; fields) {
                  bytes += (k.sizeof + k.length + v.sizeof + v.length);
              }

              foreach(k, v; doubles) {
                  bytes += (k.sizeof + k.length + v.sizeof);
              }

              return bytes + Security.sizeof;
          }
}

struct Securities
{
          Security[] securities;
          private size_t[string] rics;

          // Store offsets for each TRBC group
          ulong[2][string] econSect;
          ulong[2][string] busSect;
          ulong[2][string] IndGrp;
          ulong[2][string] Ind;

          @nogc @property pure size_t bytes()
          {
              size_t bytes;

              foreach(Security s; securities) {
                  bytes += s.sizeof + s.bytes;
              }

              foreach(k, v; rics) {
                  bytes += k.sizeof + k.length + v.sizeof;
              }

              foreach(k, v; econSect) {
                  bytes += k.sizeof + k.length + v.sizeof;
              }

              foreach(k, v; busSect) {
                  bytes += k.sizeof + k.length + v.sizeof;
              }

              foreach(k, v; IndGrp) {
                  bytes += k.sizeof + k.length + v.sizeof;
              }

              foreach(k, v; Ind) {
                  bytes += k.sizeof + k.length + v.sizeof;
              }

              return bytes + Securities.sizeof;
          }
}
````

Calling Securities.bytes shows "188 MB", but "ps" shows about 591
MB of Resident memory. Where is the memory usage coming from?
What am I missing?
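
One way to narrow this down is to compare the hand-computed estimate with what the GC itself reports. The sketch below is an assumption-laden illustration, not a diagnosis: naiveBytes() is a hypothetical helper that applies the same counting rule as the bytes properties above to a single double[string], and GC.stats() requires a reasonably recent druntime. The key names and entry count are made up.

```d
import core.memory : GC;
import std.conv : to;
import std.stdio : writefln;

// Same counting rule as the `bytes` properties above: slice header plus
// key characters plus the value, with no allowance for hash-table overhead.
size_t naiveBytes(const double[string] aa)
{
    size_t n;
    foreach (k, v; aa)
        n += k.sizeof + k.length + v.sizeof;
    return n;
}

void main()
{
    immutable before = GC.stats().usedSize;

    // Build an associative array with many small entries (illustrative data).
    double[string] doubles;
    foreach (i; 0 .. 100_000)
        doubles["field" ~ i.to!string] = i;

    immutable after = GC.stats().usedSize;

    writefln("naive estimate: %s bytes", naiveBytes(doubles));
    writefln("GC heap growth: %s bytes", after - before);
}
```

If the heap growth comes out well above the estimate, the gap would be memory the formula does not count, such as per-entry bookkeeping in the associative arrays, allocation size rounding, and any garbage left over from building the keys. The exact numbers will vary by compiler version and platform.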
