Re: [Pharo-users] [Data Modeling] approaches to data persistence

sergio ruiz Tue, 14 Feb 2017 10:14:06 -0800

this is a GREAT answer.. a totally smalltalky answer!

i need to extend this test out to include randomly accessing a million records 
at a time…
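
A rough sketch of what that random-access test might look like, in the same playground style as the benchmark quoted below. It assumes ids run from 1 to n and were added in order, so that a position in the Array or OrderedCollection doubles as the id; the variable names are illustrative only, and plain integers stand in for the records.

```smalltalk
| n a o d keys atime otime dtime |
n := 1000000.
a := Array new: n.
o := OrderedCollection new: n.
d := Dictionary new: n.
"Fill all three containers with the same n records."
1 to: n do: [:id |
    a at: id put: id.
    o add: id.
    d at: id put: id].
"Draw the random ids up front so that generating them is not timed."
keys := (1 to: n) collect: [:i | n atRandom].
atime := Time millisecondsToRun: [keys do: [:k | a at: k]].
otime := Time millisecondsToRun: [keys do: [:k | o at: k]].
dtime := Time millisecondsToRun: [keys do: [:k | d at: k]].
{atime. otime. dtime}
```

If ids are not dense and in insertion order, only the Dictionary lookup stays a direct access; the other two would need a search.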




On February 14, 2017 at 12:15:28 PM, Ben Coman (b...@openinworld.com) wrote:



On Wed, Feb 15, 2017 at 12:29 AM, Ben Coman <b...@openinworld.com> wrote:


On Tue, Feb 14, 2017 at 11:01 PM, sergio ruiz <sergio....@gmail.com> wrote:
Hey, all..

I have been working on creating a REST interface using Teapot. In learning how 
to handle exceptions, I have been following along with the library example. 

One of the things i noticed was that, in the library example, they are modeling 
that data a little differently than i have been..

to persist a list of items (and easily retrieve them), i just gave the object 
an “id”, and store them on a class variable as an OrderedCollection..

in the library example, I see something i really like. rather than saving an 
ordered collection, they save it as a dictionary.

This dictionary goes { id -> object }.. this takes the id out of the object 
(which i really like) and makes the id generation pretty much irrelevant..

my question.. is there any performance hit either way once this list grows to 
tens of thousands of records?
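
For retrieval by id the two layouts differ in kind: an OrderedCollection of items that carry their own id has to be scanned with detect:, while a Dictionary hashes straight to the entry. A minimal sketch of that comparison follows; Associations stand in for the real objects, and the names records, byId and wanted are made up for illustration.

```smalltalk
| n records byId wanted listTime dictTime |
n := 10000.
records := OrderedCollection new.
byId := Dictionary new.
1 to: n do: [:id |
    records add: id -> 'blahblah'.   "id travels with the item"
    byId at: id put: 'blahblah'].    "id exists only as the key"
wanted := n.                         "last item: worst case for a linear scan"
listTime := Time millisecondsToRun: [
    1000 timesRepeat: [records detect: [:each | each key = wanted]]].
dictTime := Time millisecondsToRun: [
    1000 timesRepeat: [byId at: wanted]].
{listTime. dictTime}
```

The scan cost grows with the size of the collection, so the gap widens as the list gets longer.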



I was curious, so nothing better than to experiment...

myClass := Object subclass: #AA
instanceVariableNames: 'id data'
classVariableNames: ''
package: 'AAAA'.
myClass compile: 'id: i id:= i'.
myClass compile: 'data: d data:= d'.

N := 10 raisedTo: 7.
o := OrderedCollection new.
d := Dictionary new.
{ Time millisecondsToRun: [
1 to: N do: [:id| o add: (AA new id: id; data: 'blahblah')]].
Time millisecondsToRun: [
1 to: N do: [:id| d at: id put: (AA new data: 'blahblah')]].
} inspect.
o := nil.
d := nil.
Smalltalk garbageCollect.

Milliseconds as #(OrderedCollection Dictionary), where N=5 means 10 raisedTo: 5:
N=5 ==> "#(5 42)"
N=6 ==> "#(434 839)"
N=7 ==> "#(5733 17208)"

Slight modification to pre-allocate space to ignore dynamic growth cost...
o := OrderedCollection new: 2 * N.
d := Dictionary new:  2 * N.

N=5 ==> "#(7 33)"
N=6 ==> "#(411 802)"
N=7 ==> "#(5892 15141)"

cheers -ben

Let's also bench Arrays, and be a bit nicer about cleaning up memory...

N := 10 raisedTo: 7.
a := Array new: 2 * N.
"Net time: (GC time before + elapsed) - GC time after, i.e. elapsed minus GC."
atime := Smalltalk vm totalGCTime + (Time millisecondsToRun: [
1 to: N do: [:id| a at: id put: (AA new data: 'blahblah')]]) - Smalltalk vm totalGCTime.
a := nil. 
Smalltalk garbageCollect.

o := OrderedCollection new: 2 * N.
otime := Smalltalk vm totalGCTime + (Time millisecondsToRun: [
1 to: N do: [:id| o add: (AA new id: id; data: 'blahblah')]]) - Smalltalk vm totalGCTime.
o := nil. 
Smalltalk garbageCollect.

d := Dictionary new: 2 * N.
dtime := Smalltalk vm totalGCTime + (Time millisecondsToRun: [
1 to: N do: [:id| d at: id put: (AA new data: 'blahblah')]]) - Smalltalk vm totalGCTime.
d := nil. 
Smalltalk garbageCollect.

{atime. otime. dtime} inspect.

Three runs each, in milliseconds, as #(Array OrderedCollection Dictionary):
N=5 ==> "#(2 4 13)"  "#(2 4 13)"  "#(2 5 13)"
N=6 ==> "#(30 48 131)" "#(28 48 131)" "#(29 47 128)"
N=7 ==> "#(274 470 1313)" "#(259 456 1340)" "#(269 467 1306)"
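
The timing expressions used for these runs fold the GC correction into one line. The same idea can be spelled out step by step, as in the sketch below; timeWithoutGC is a hypothetical helper name rather than a Pharo API, and only Smalltalk vm totalGCTime and Time millisecondsToRun: are taken from the code in this thread.

```smalltalk
| timeWithoutGC a elapsed |
"Answer the milliseconds aBlock took, minus GC time spent while it ran."
timeWithoutGC := [:aBlock | | gcBefore wall |
    gcBefore := Smalltalk vm totalGCTime.
    wall := Time millisecondsToRun: aBlock.
    wall - (Smalltalk vm totalGCTime - gcBefore)].
a := Array new: 100000.
elapsed := timeWithoutGC value: [
    1 to: a size do: [:id | a at: id put: id printString]].
elapsed
```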

So insertions into Dictionaries are about three times slower than into an
OrderedCollection, and four to six times slower than into Arrays.

Now this is milliseconds, so even at the 100,000 level Dictionary performance 
may be a reasonable tradeoff for other benefits.

cheers -ben
----
peace,
sergio
photographer, journalist, visionary

