Hi again,

I just filed a bug report here:
http://bugzilla.scilab.org/show_bug.cgi?id=15809

Would it be possible to bring back the old mem-dump approach in Scilab 6? I 
mean, could I write a gateway that just takes a pointer to the first byte in 
memory, figures out the size, and dumps it to disk? Or maybe it doesn’t work 
like that. Writing a JSON exporter for storing filter coefficients in a math 
software package seems a bit ridiculous, but hey, if it works it might be worth 
it in our case.

Cheers,
Arvid


From: users <users-boun...@lists.scilab.org> on behalf of Clément DAVID 
<clement.da...@scilab-enterprises.com>
Reply-To: Users mailing list for Scilab <users@lists.scilab.org>
Date: Monday, 15 October 2018 at 15:48
To: Users mailing list for Scilab <users@lists.scilab.org>
Cc: Clément David <clement.da...@esi-group.com>
Subject: Re: [Scilab-users] HDF5 save is super slow

Hello all,

Correct, I experienced the same slowness while working with Xcos diagrams for 
Scilab 5. At first we considered HDF5 for storing this deeply nested list / 
mlist data structure; however, after some tests, it appeared that XML might be 
used for tree-like storage and HDF5 (or Java type serialization) for big 
matrices.

AFAIK there is currently no easy way to load/save while specifying a format 
other than HDF5; adding an xmlSave/xmlLoad sci_gateway that lets the user 
select an XML file format for any Scilab structure might provide better 
performance for your use case. JSON might also be a candidate to look at for 
decent serialization support.

PS: Scilab 5.5.1 load/save are direct memory dumps, so this is really the 
fastest you can get from Scilab; the HDF5 binary format is good enough for 
matrices.

--
Clément

From: users <users-boun...@lists.scilab.org> On Behalf Of Stéphane Mottelet
Sent: Monday, October 15, 2018 2:36 PM
To: users@lists.scilab.org
Subject: Re: [Scilab-users] HDF5 save is super slow

Hello,

I looked a little bit in the sources: the evident bottleneck is the nested 
creation of an hdf5 group each time that a container variable is met.
For the given example, this is particularly evident. If you replace the syslin 
structure by the corresponding [A,B;C,D] matrix, then save is ten times faster:

N = 4;
n = 1000;
filters = list();
for i=1:n
  G=syslin('c', rand(N,N), rand(N,1), rand(1,N), rand(1,1));
  filters($+1) = G;
end
tic();
save('filters.dat', 'filters');
disp(toc());
--> disp(toc());

   0.724754

N = 4;
n = 1000;
filters = list();
for i=1:n
  G=syslin('c', rand(N,N), rand(N,1), rand(1,N), rand(1,1));
  filters($+1) = [G.a G.b;G.c G.d];
end
tic();
save('filters.dat', 'filters');
disp(toc());
--> disp(toc());

   0.082302

Serializing container objects seems to be the solution, but it goes towards an 
orthogonal direction w.r.t. the hdf5 portability spirit.
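
The packing trick above can be sketched as a full round trip. This is only a sketch: it assumes every filter is a continuous-time syslin with a zero initial state, because the packed matrix keeps only A, B, C and D (X0 and the time domain are not stored). The names packed, restored and filters_packed.dat are illustrative, not part of any existing API.

```scilab
// Build the same kind of test data as in the examples above.
N = 4;
n = 1000;
filters = list();
for i = 1:n
    filters($+1) = syslin('c', rand(N,N), rand(N,1), rand(1,N), rand(1,1));
end

// Pack: one plain (N+1)x(N+1) matrix per filter instead of a syslin tlist.
packed = list();
for i = 1:n
    G = filters(i);
    packed($+1) = [G.A G.B; G.C G.D];
end
save('filters_packed.dat', 'packed');

// Unpack: reload the matrices and rebuild the syslin objects.
clear packed
load('filters_packed.dat');
restored = list();
for i = 1:length(packed)
    M = packed(i);
    Ns = size(M, 1) - 1;              // number of states
    restored($+1) = syslin('c', M(1:Ns,1:Ns), M(1:Ns,Ns+1), ..
                                M(Ns+1,1:Ns), M(Ns+1,Ns+1));
end
```

The unpacking step assumes single-input, single-output filters; for other dimensions the block sizes would have to be stored alongside the matrix.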

S.


On 15/10/2018 at 12:22, Antoine Monmayrant wrote:
On 15/10/2018 at 11:55, Arvid Rosén wrote:
Hi,

Thanks for getting back to me!

Unfortunately, we used Scilab’s pretty cool way of doing object orientation, so 
we have big nested tlist structures with multiple instances of various lists of 
filters and other structures, as in my example. Saving those structures in some 
explicit manual way would be extremely complicated. Or is there some way of 
writing explicit HDF5 saving/loading schemes using overloading? That would be 
great! I am sure we could find the main culprits and do something explicit for 
them, but as they can be located wherever in a big nested structure, it would 
be painful to do anything on the top level.

Another, related I guess, problem here is that the new file format uses about 
15 times as much disk space as the old format (for a typical ill-behaved nested 
structure). That adds to the save/load time too I guess, but is probably not 
the main source here.
Argh, yes, I tested it and with your example I get a file that is 8.5 times bigger.
I think that both increases in time and size are real issues and should be 
reported as bugs.

By the way, I rewrote your script to run it under both 6.0 and 5.5:

/////////////////////////////////
N = 4;
n = 10000;
filters = list();

for i=1:n
  G=syslin('c', rand(N,N), rand(N,1), rand(1,N), rand(1,1));
  filters($+1) = G;
end

ver=getversion('scilab');

if ver(1)<6 then
    tic();
    save('filters_old.dat', filters);
    ts1 = toc();
else
    tic();
    save('filters_new.dat', 'filters');
    ts1 = toc();
end

printf("Time for save %.2fs\n", ts1);
/////////////////////////////////

Hope it helps,

Antoine




I think I might have reported this earlier using Bugzilla, but I’m not sure. 
I’ll check and report it if not.

Cheers,
Arvid

From: users <users-boun...@lists.scilab.org> on behalf of 
"amonm...@laas.fr" <amonm...@laas.fr>
Reply-To: "antoine.monmayr...@laas.fr" <antoine.monmayr...@laas.fr>, Users 
mailing list for Scilab <users@lists.scilab.org>
Date: Monday, 15 October 2018 at 11:08
To: "users@lists.scilab.org" <users@lists.scilab.org>
Subject: Re: [Scilab-users] HDF5 save is super slow

Hello,

I tried your code in 5.5.1 and the latest nightly build of 6.0: I see a 
slowdown by a factor of around 175 between the old save in 5.5.1 and the new 
(and only) save in 6.0.
It's really related to the data structure, because we use hdf5 read/write a lot 
here and did not experience significant slowdowns using 6.0.
I think the overhead might come from the translation of your fairly complex 
variable (a long array of tlists) into the corresponding hdf5 structure.
In the old save, this translation was not necessary.
Maybe you could try to save your data in a different way.
For example:
1) you could save each element of "filters" in a separate file.
2) you could bypass save and directly write your data in a hdf5 file by using 
h5open(), h5write() directly. It means you need to write your own load() for 
your custom file format. But this way, you can try to find the best way to 
layout your data in hdf5 format.
3) in addition to 2) you could try to save each entry of your "filters" array 
as one dataset in a given hdf5 file.
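
As a rough illustration of options 2) and 3), the sketch below bypasses save and writes all filters into a single dataset with h5open(), h5write(), h5read() and h5close() from Scilab's HDF5 module. The file name filters.h5, the dataset name and the stacked layout (one (N+1)x(N+1) block [A B; C D] per filter, blocks stacked vertically) are assumptions chosen for the example, not an established format, and no timing is claimed for it.

```scilab
N = 4;
n = 1000;

// Stack the [A B; C D] block of each filter into one large matrix,
// so that HDF5 sees a single dataset instead of thousands of groups.
data = zeros(n*(N+1), N+1);
for i = 1:n
    G = syslin('c', rand(N,N), rand(N,1), rand(1,N), rand(1,1));
    rows = (i-1)*(N+1) + (1:N+1);
    data(rows, :) = [G.A G.B; G.C G.D];
end

// Write the whole matrix as one dataset ("w" creates or truncates the file).
h5 = h5open('filters.h5', 'w');
h5write(h5, 'filters', data);
h5close(h5);

// Custom "load": read the dataset back and slice out filter number k.
h5 = h5open('filters.h5', 'r');
back = h5read(h5, '/filters');
h5close(h5);

k = 3;
Mk = back((k-1)*(N+1) + (1:N+1), :);
Gk = syslin('c', Mk(1:N,1:N), Mk(1:N,N+1), Mk(N+1,1:N), Mk(N+1,N+1));
```

If the file is meant to be read by other tools, it is worth checking the row/column ordering of the dataset there, since only the Scilab round trip is exercised here.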

Did you search on bugzilla whether this bug was already submitted?
Could you try to report it?


Antoine

On 15/10/2018 at 10:11, Arvid Rosén wrote:
/////////////////////////////////
N = 4;
n = 10000;

filters = list();

for i=1:n
  G=syslin('c', rand(N,N), rand(N,1), rand(1,N), rand(1,1));
  filters($+1) = G;
end

tic();
save('filters.dat', filters);
ts1 = toc();

tic();
save('filters.dat', 'filters');
ts2 = toc();

printf("old save %.2fs\n", ts1);
printf("new save %.2fs\n", ts2);
printf("slowdown %.1f\n", ts2/ts1);
/////////////////////////////////



--
+++++++++++++++++++++++++++++++++++++++++++++++++++++++

 Antoine Monmayrant LAAS - CNRS
 7 avenue du Colonel Roche
 BP 54200
 31031 TOULOUSE Cedex 4
 FRANCE

 Tel:+33 5 61 33 64 59

 email : antoine.monmayr...@laas.fr
 permanent email : antoine.monmayr...@polytechnique.org

+++++++++++++++++++++++++++++++++++++++++++++++++++++++












_______________________________________________
users mailing list
users@lists.scilab.org
http://lists.scilab.org/mailman/listinfo/users



--
Stéphane Mottelet
Ingénieur de recherche
EA 4297 Transformations Intégrées de la Matière Renouvelable
Département Génie des Procédés Industriels
Sorbonne Universités - Université de Technologie de Compiègne
CS 60319, 60203 Compiègne cedex
Tel : +33(0)344234688
http://www.utc.fr/~mottelet
