Fwiw, I have been working on something similar. Others will have
more experience with the GC, but perhaps you might find this
interesting.

For CSV files, what I found is that parsing is quite slow (and
memory intensive). So rather than parse the same data every
time, I found it helpful to do so once in a batch that runs on a
cron job, and write the result out in msgpack format.

I am not a GC expert, but what happens if you run GC.collect()
once you are done parsing?

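import std.file;
import msgpack; // msgpack-d

// Load the pre-parsed gilt prices from the msgpack file written by the batch job.
auto loadGiltPrices()
{
    auto data = cast(ubyte[]) std.file.read("/hist/msgpack/dmo.pack");
    return cast(immutable) data.unpack!(GiltPriceFromDMO[][string]);
}
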
struct GiltPriceFromDMO
{
    string name;
    string ISIN;
    KPDateTime redemptionDate;
    KPDateTime closeDate;
    int indexLag;
    double cleanPrice;
    double dirtyPrice;
    double accrued;
    double yield;
    double modifiedDuration;
}

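import std.stdio : writefln;

// Batch job: parse the CSV once and write the result out as msgpack.
void main(string[] args)
{
    auto gilts = readCSVDMO();
    ubyte[] data = pack(gilts);
    std.file.write("dmo.pack", data);
    writefln("* done");
    // read the packed bytes back in
    data = cast(ubyte[]) std.file.read("dmo.pack");
}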
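
To make the GC.collect() question concrete, here is a rough sketch of
the experiment I have in mind. It is not taken from either app: churn
is a made-up stand-in for the CSV parse, and reclaim is a hypothetical
helper. It assumes a druntime recent enough to provide GC.stats.
GC.minimize() asks the collector to hand free pages back to the OS;
whether resident memory actually drops depends on how fragmented the
heap is.

```d
import core.memory : GC;
import std.stdio : writefln;

// Hypothetical stand-in for a CSV parse: creates many short-lived buffers.
size_t churn(size_t rounds)
{
    size_t total;
    foreach (i; 0 .. rounds)
    {
        auto tmp = new ubyte[](4096);
        tmp[0] = cast(ubyte) i;
        total += tmp.length;
    }
    return total;
}

// Run a full collection, then ask the GC to return free pages to the OS.
GC.Stats reclaim()
{
    GC.collect();
    GC.minimize();
    return GC.stats();
}

void main()
{
    churn(50_000);
    auto before = GC.stats();
    auto after = reclaim();
    writefln("used: %s -> %s bytes, free: %s -> %s bytes",
        before.usedSize, after.usedSize, before.freeSize, after.freeSize);
}
```

If the used figure drops sharply but "ps" still reports a large
resident size, my guess would be that the gap is free memory the GC is
holding on to rather than live data.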
On Thursday, 16 April 2015 at 12:17:24 UTC, Adil wrote:
I've written a simple socket-server app that loads securities (stock
market shares) data and allows clients to query it. The
app starts by loading instrument information from a CSV file into
some structs, then listens on a socket responding to queries. It
doesn't mutate the data or allocate anything substantial.

There are 2 main structs in the app. One stores security data,
and the other groups together securities. They are defined as
follows:
````
__gshared Securities securities;

struct Security
{
    string RIC;
    string TRBC;
    string[string] fields;
    double[string] doubles;

    @nogc @property pure size_t bytes()
    {
        size_t bytes;
        bytes = RIC.sizeof + RIC.length;
        bytes += TRBC.sizeof + TRBC.length;
        foreach(k,v; fields) {
            bytes += (k.sizeof + k.length + v.sizeof + v.length);
        }
        foreach(k, v; doubles) {
            bytes += (k.sizeof + k.length + v.sizeof);
        }
        return bytes + Security.sizeof;
    }
}

struct Securities
{
    Security[] securities;
    private size_t[string] rics;

    // Store offsets for each TRBC group
    ulong[2][string] econSect;
    ulong[2][string] busSect;
    ulong[2][string] IndGrp;
    ulong[2][string] Ind;

    @nogc @property pure size_t bytes()
    {
        size_t bytes;
        foreach(Security s; securities) {
            bytes += s.sizeof + s.bytes;
        }
        foreach(k, v; rics) {
            bytes += k.sizeof + k.length + v.sizeof;
        }
        foreach(k, v; econSect) {
            bytes += k.sizeof + k.length + v.sizeof;
        }
        foreach(k, v; busSect) {
            bytes += k.sizeof + k.length + v.sizeof;
        }
        foreach(k, v; IndGrp) {
            bytes += k.sizeof + k.length + v.sizeof;
        }
        foreach(k, v; Ind) {
            bytes += k.sizeof + k.length + v.sizeof;
        }
        return bytes + Securities.sizeof;
    }
}
````
Calling Securities.bytes shows "188 MB", but "ps" shows about 591
MB of resident memory. Where is the memory usage coming from?
What am I missing?