Let me start with a disclaimer and explanation. I wanted to have font metrics for a set of Unicode fonts available to some SML code. The metrics are extracted from the fonts by a cascade of programs, and I already had code to inject them into C projects by creating files that went
  static font_info[] = {
    ... ... ... };
with MANY lines of simple static data. Doing things that way makes the metric data instantly available within my program. I modified things to create the same in SML and ended up with
  val metrics = Vector.fromList [
    Vector.fromList [a,b,c,d,e,f],
    Vector.fromList [a,b,c,d,e,f],
    ... ];
with around 10000 entries in the top-level list. This was simple to do, and in particular simpler, both now and for when/if the code gets used by people, than anything that has SML code read the metric data from a resource file. And I am only interested in the metrics I have right now, so I am not concerned about future expansion there.
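
Just to show how the data then gets used, a lookup is simply vector indexing, along these lines (the idea that rows are indexed by character code and that field 0 happens to be the advance width is purely illustrative, not a description of my real layout):

  (* Illustrative only: assumes one row per character code,
     with the advance width in field 0 of each row. *)
  fun metricsFor charCode = Vector.sub (metrics, charCode)
  fun advanceWidth charCode = Vector.sub (metricsFor charCode, 0)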

Processing the input there works, but takes around 10 minutes on a reasonably fast desktop machine. That felt pretty bad.

If I change the code to go (as it were)
  [[x,x,x,x] @ [x,x,x,x] @ [x,x,x,x]]
at the top level, with say 200 "x"s in each segment and 50 segments, the code is processed in more like 10 seconds rather than 10 minutes.
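
Concretely, the generated file now has a shape something like this (dummy zero rows stand in for the real metric values, and the row and segment counts are only indicative):

  val metrics = Vector.fromList (
      [Vector.fromList [0, 0, 0, 0, 0, 0],
       Vector.fromList [0, 0, 0, 0, 0, 0]]   (* ... about 200 rows per segment ... *)
    @ [Vector.fromList [0, 0, 0, 0, 0, 0],
       Vector.fromList [0, 0, 0, 0, 0, 0]]   (* ... and about 50 segments in all ... *)
  );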

This feels like a really easy case where some aspect of processing lists in the input has been coded with a quadratic growth rate in a way that is probably not necessary (the fact that the version segmented using "@" is handled so much faster reassures me of that).
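
The usual shape of that sort of problem - and this is only a sketch of the general pattern, not a claim about what the Poly/ML parser actually does - is appending each new element to the end of the accumulated list instead of consing and reversing once:

  (* Quadratic: every step copies the whole accumulator. *)
  fun readAllSlow readItem acc =
      case readItem () of
          NONE => acc
        | SOME x => readAllSlow readItem (acc @ [x])

  (* Linear: cons onto the front, reverse once at the end. *)
  fun readAllFast readItem acc =
      case readItem () of
          NONE => List.rev acc
        | SOME x => readAllFast readItem (x :: acc)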

A really easy demo of this is just
  val l = length [
    Vector.fromList [0],
    ... ];
with 10000 entries in the list.
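
To save anyone typing that in, a throwaway generator along the following lines will produce such a file (the file name and the helper's name are of course arbitrary):

  (* Writes a source file containing one big list literal with n entries. *)
  fun writeDemo (path, n) =
    let
      val out = TextIO.openOut path
      fun loop i =
        if i > n then ()
        else (TextIO.output (out,
                "  Vector.fromList [0]" ^ (if i < n then ",\n" else "\n"));
              loop (i + 1))
    in
      TextIO.output (out, "val l = length [\n");
      loop 1;
      TextIO.output (out, "];\n");
      TextIO.closeOut out
    end;

  val () = writeDemo ("demo10000.sml", 10000);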

I understand very happily that this is pathological code that nobody would ever write by hand, and that as a way of handling data it can be viewed as inflexible and crude. I am also happy that I can work round it using "@", so it is not holding me up - but it felt worth reporting, and my expectation is that a very small change to the code would address it. I had a look and thought getList in SKIPS_.ML might be the issue, but when I recode that to build the list backwards and then reverse it at the end, that does not unclog things, so the issue is at least one step deeper.

 Arthur