On 2014-11-14 16:44, Jørgen Kvalsvik wrote:
On 14. nov. 2014 16:33, Bård Skaflestad wrote:
As a general rule I usually distrust std::map<> when it comes to
handling anything that has to do with computational performance
and inner loops.
[ Reason elided ]
I changed it to std::map as a test, and with what I had it
out-performed the previous std::vector-of-triples solution.
I don't doubt that std::map<> was faster than the original version that
std::sort()ed a std::vector<> of triplets. If I'm guessing, however,
I'd attribute the speedup more to a non-optimal choice of initial data
structure than std::map<> being particularly good.
If I were implementing the SparseMatrixBuilder<> from scratch I'd use
something a lot more targeted to the actual application (conversion from
coordinate to CSR form). Like I said, you don't actually need to use a
general sorting procedure at all (complexity O(N log N) in number of
non-zero elements N). You can achieve a linear bucket sort directly by
transposing the structure twice--albeit at the cost of temporarily
duplicating the storage arrays. This is what all commercial packages do
(or at least some variations of the basic approach). As a side benefit
this scheme allows easy compression of duplicate entries using a
skipping/summing rule.
There are further refinements possible too, such as O(1) rather than
O(log (number of row elements)) insertion if you guarantee that you
always visit the matrix elements in the same order every time you form
the coefficient matrix.
If you want to, I can outline this in code but you should really have a
look at, say, Timothy Davis' "Direct Methods for Sparse Linear Systems"
for some background material.
Is Dune::BCRSMatrix internal storage dependent on construction type?
It certainly used to be. I last checked the implementation in Dune
2.2.x. At that time the row-creation iterator allocated a separate
value container for each row. If you knew the total number of non-zero
elements at object construction time it used a single array for the
column indices though. On the other hand the random mode, which does
put quite a bit of responsibility on the user, ends up using just a
three contiguous arrays internally--one for the row pointers, one for
the column indices and one for the values themselves (an array of block
types).
Bård
_______________________________________________
Opm mailing list
Opm@opm-project.org
http://www.opm-project.org/mailman/listinfo/opm