Hi all,

I don't agree. Disk access is inherently slower than RAM access.

I think that this discussion started for Unidata and then got UniVerse involved too but it might have been the other way around. Sadly, there is no internals training material for Unidata so we have to guess what goes on.

Different multivalue products approach string management in varying ways. In UniVerse, strings are stored as contiguous memory. If I write a statement such as
  X<-1> = 'ABC'
the run machine has to work out how big the new string will be, allocate memory, copy the old value of X to the new area, append ABC to it, and then release the original memory used by X.
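As a rough illustration of those steps, here is a small Python model of a contiguous string. It is only a sketch and not UniVerse's actual code: the class name, the bytes_copied counter and the use of a Python bytearray are all assumptions made for the example. The field mark is taken to be character 254.

```python
class ContiguousString:
    """Toy model of a dynamic array held in one contiguous buffer."""

    def __init__(self):
        self.data = bytearray()   # the single contiguous buffer
        self.bytes_copied = 0     # running total of old bytes moved

    def append_field(self, value, delimiter=b"\xfe"):
        """Model of X<-1> = value (no field mark before the first field)."""
        old = self.data
        sep = delimiter if old else b""
        new = bytearray(len(old) + len(sep) + len(value))  # work out size, allocate
        new[:len(old)] = old                               # copy the old value
        self.bytes_copied += len(old)
        new[len(old):] = sep + value                       # append the new field
        self.data = new                                    # old buffer is released


x = ContiguousString()
for _ in range(3):
    x.append_field(b"ABC")
print(bytes(x.data), x.bytes_copied)
```

The point to notice is that bytes_copied grows with the current length of the string on every append, not with the size of the field being added.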

As you append successive fields, the string to be moved gets longer and longer. We tend to think of computers as being blindingly fast but copying a big string is still a slow process. If I have a string that starts empty and I add a million fields, each of 3 bytes plus the delimiter, I will end up copying a total of approximately 1,999,998,000,000 bytes (about 2 TB) - hardly an insignificant task.
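The arithmetic can be checked with a few lines of Python. This is a sketch under stated assumptions: only the old value is counted on each append, there is no delimiter before the first field, and the function names are made up for the example. Counting the delimiter slightly differently shifts the total by about a million bytes, which is why the result does not match the figure above to the last digit.

```python
def total_bytes_copied(fields, field_len):
    """Closed form: old bytes copied over `fields` successive appends.

    Assumes field_len >= 1. After k appends the string is
    k * (field_len + 1) - 1 bytes long, and the next append copies all of it.
    """
    n = fields
    if n < 1:
        return 0
    return (field_len + 1) * n * (n - 1) // 2 - (n - 1)


def total_bytes_copied_by_loop(fields, field_len):
    """Same quantity computed by brute force, as a cross-check."""
    total = 0
    length = 0
    for _ in range(fields):
        total += length                              # copy the old value
        length += field_len + (1 if length else 0)   # delimiter plus new field
    return total


print(f"{total_bytes_copied(1_000_000, 3):,}")   # 1,999,997,000,001
```

Doubling the number of fields roughly quadruples the work, which is the real problem with this pattern.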

From my own experiments some time ago, I believe that Unidata also uses contiguous strings but I have no direct proof of this. The alternative (adopted by our QM product, by PI/open, Information and perhaps others) is to use "chunked strings" where a string is stored as a series of chunks. In this model, appending a field requires only addition of a new chunk or, for better performance, replacement of the final chunk.
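A chunked string can be modelled in the same toy fashion. This is emphatically not how QM or any other product implements it internally: the class name, the MAX_CHUNK limit of 16 bytes and the policy of growing the final chunk until it is full are all assumptions chosen to keep the example short.

```python
class ChunkedString:
    """Toy model of a string stored as a series of chunks."""

    MAX_CHUNK = 16  # hypothetical chunk size limit, tiny for demonstration

    def __init__(self):
        self.chunks = []        # list of bytes objects, in order
        self.bytes_copied = 0   # old bytes moved by appends

    def append_field(self, value, delimiter=b"\xfe"):
        data = (delimiter if self.chunks else b"") + value
        if self.chunks and len(self.chunks[-1]) + len(data) <= self.MAX_CHUNK:
            # Replace the final chunk: only that chunk's bytes are copied.
            last = self.chunks[-1]
            self.bytes_copied += len(last)
            self.chunks[-1] = last + data
        else:
            # No room in the final chunk: just add a new one.
            self.chunks.append(data)

    def to_bytes(self):
        return b"".join(self.chunks)


s = ChunkedString()
for _ in range(5):
    s.append_field(b"ABC")
print(len(s.chunks), s.bytes_copied)
```

However long the string becomes, a single append in this model never copies more than MAX_CHUNK bytes of old data.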

Of course, the performance gain of chunked strings in this example may be offset by their decreased performance for things like substring extraction, which is now more complex than a simple indexing operation.
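To show what that extra complexity looks like, here is a sketch of substring extraction over a list of chunks, compared with the single slice that suffices for a contiguous string. The function name and the representation of chunks as a Python list of bytes are assumptions for illustration only.

```python
def chunked_substring(chunks, start, length):
    """Return `length` bytes from 0-based offset `start` across the chunks."""
    out = bytearray()
    pos = 0  # offset of the current chunk within the whole string
    for chunk in chunks:
        if len(out) >= length:
            break                      # we already have everything we need
        chunk_end = pos + len(chunk)
        if chunk_end > start:          # this chunk overlaps the wanted range
            lo = max(start - pos, 0)
            hi = min(lo + (length - len(out)), len(chunk))
            out += chunk[lo:hi]
        pos = chunk_end
    return bytes(out)


chunks = [b"ABCDE", b"FGH", b"IJKLM"]
print(chunked_substring(chunks, 3, 6))      # b'DEFGHI'
print(b"".join(chunks)[3:3 + 6])            # contiguous case: one slice
```

The chunked version has to walk the chunk list to find the starting point, whereas the contiguous version goes straight to the offset.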

By way of a simple example, I just tried the following program...
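  s = ''
  z = str('*', 1000)
  * Record the start time (seconds since midnight)
  t1 = time()
  * Append 100,000 fields of 1000 bytes each
  for i = 1 to 100000
     s<-1> = z
  next i
  t2 = time()
  * Show the elapsed time in seconds
  crt t2 - t1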

This took six seconds on QM but 32 minutes on UniVerse. I do not have a Unidata system available at the moment to try. To be fair, I am sure that I could construct an example that reversed the performance difference.

Writing to a sequential file is somewhat similar to the chunked string model: it buffers data until it has a good-sized chunk, writes it out, and then continues with an empty buffer.
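That buffering behaviour can be sketched in the same way. Again this is only a model under assumptions: the class name, the buffer_size parameter and the writes counter are hypothetical, and an in-memory io.BytesIO object stands in for the sequential file.

```python
import io


class SeqWriter:
    """Toy buffered writer: collect data, write it out in sizeable chunks."""

    def __init__(self, sink, buffer_size=8192):
        self.sink = sink                # any object with a write() method
        self.buffer_size = buffer_size
        self.buffer = bytearray()
        self.writes = 0                 # number of physical writes issued

    def write(self, data):
        self.buffer += data
        if len(self.buffer) >= self.buffer_size:
            self.flush()

    def flush(self):
        if self.buffer:
            self.sink.write(bytes(self.buffer))
            self.writes += 1
            self.buffer = bytearray()   # continue with an empty buffer


sink = io.BytesIO()
w = SeqWriter(sink, buffer_size=10)
for _ in range(5):
    w.write(b"ABC\n")
w.flush()
print(w.writes, len(sink.getvalue()))
```

As with the chunked string, data already written out is never touched again, so the cost of each write does not depend on how much has gone before.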


Martin Phillips
Ladybridge Systems Ltd
17b Coldstream Lane, Hardingstone, Northampton, NN4 6DB
+44-(0)1604-709200
_______________________________________________
U2-Users mailing list
U2-Users@listserver.u2ug.org
http://listserver.u2ug.org/mailman/listinfo/u2-users
