Re: Arrays vs Object for Key/Value pair lookups
Hey John! Glad to see you're still digging into details. C_OBJECT is great, I use it all the time. It's always nice to have more choices in 4D. (You know more languages than most people, so you were doubtless already missing something like C_OBJECT.)

I posted speed numbers a few days ago, with some code that anyone can test for themselves and notes on how I interpret the results. For me, I'll use C_OBJECT when it's easier and sorted arrays when they're easier. The speed difference between access into sorted arrays and C_OBJECT is trivial to the point of irrelevance. (Arrays are quicker to set up, C_OBJECT may be a tiny bit faster for lookups. But both of those things "depend".)

I like the example you provided and agree it's a perfect case for C_OBJECT. Web input management is a problem that cries out for a simple lookup structure. One nice thing about C_OBJECT over sorted arrays is that you don't have to worry about sort order. One nice thing about arrays is that the access syntax is better than some of the available C_OBJECT access options.

As an example of when I find arrays easier, I've got one. Well, I've got lots, but I've got an interesting one. I like to keep an error stack. This way I can run a bunch of lines of code to validate inputs, etc. as a block and then check the number of errors after the block. No errors? Carry on! 1+ error(s), deal with it. Sometimes I want to dump everything, but most of the time all I want is a count or the most recent error details.

Now this has nothing to do with arrays and associative arrays per se, but in 4D arrays inside of objects are a bit of a pain. If I want to count the number of items, I either need to stash a counter in the object and pull it out, or pull the whole array out and get its size. If I want the last item, I have to pull the whole array out and get the item out of the array. So I keep the data in an array to make counts and item access simple. But what about a full dump of the whole stack? It's nice to have it all in one object, so I embed the array into an object. (It's an array of objects, as it turns out.) I've got some methods around all of this data so the rest of the world doesn't have to deal with the issues.

So, given the current state of play, I'm using C_OBJECT, ARRAY OBJECT, and sorted arrays. I'm not using object fields and don't think that I've got any cases for them in 4D. If I really need to store JSON closer to an external application, I can pipe it over to Postgres in a 'text' field, which compresses nicely. (Postgres 'json' and 'jsonb' fields don't sound like they compress well at all. That makes sense to me for jsonb but not for json. But what do I know?)

**
4D Internet Users Group (4D iNUG)
FAQ: http://lists.4d.com/faqnug.html
Archive: http://lists.4d.com/archives.html
Options: http://lists.4d.com/mailman/options/4d_tech
Unsub: mailto:4d_tech-unsubscr...@lists.4d.com
**
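The error-stack idea described in this message can be sketched roughly as follows. This is a Python illustration rather than 4D code, and every name in it (`error_push`, `error_count`, `error_last`, `error_dump`, `error_clear`, the `code`/`message` keys) is hypothetical; the post does not show the actual methods. It only shows the shape: records live in a plain array so counting and "most recent" are cheap, and the array is wrapped in a single object only when a full dump is wanted.

```python
import json

# Hypothetical module-level stack: a plain list of error records (dicts).
_error_stack = []

def error_push(code, message):
    """Append one error record to the end of the stack."""
    _error_stack.append({"code": code, "message": message})

def error_count():
    """Number of errors recorded so far."""
    return len(_error_stack)

def error_last():
    """Most recent error record, or None if the stack is empty."""
    return _error_stack[-1] if _error_stack else None

def error_dump():
    """Embed the whole array in one object and return it as JSON text."""
    return json.dumps({"count": len(_error_stack), "errors": _error_stack})

def error_clear():
    """Empty the stack before running a new block of checks."""
    _error_stack.clear()

# Usage: run a block of validations, then check the count once.
error_clear()
error_push(101, "Missing customer name")
error_push(102, "Invalid postcode")
if error_count() > 0:
    print(error_last()["message"])
    print(error_dump())
```

The wrapper functions are the point: callers never touch the array or the wrapping object directly, so the storage choice can change later without affecting them.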
Re: Arrays vs Object for Key/Value pair lookups
I have been loosely following this thread and thought I would share my first use of C_OBJECT.

Many years ago I inherited a 4D database that David Adams, yes you David, had done some work on. Specifically, he had created a web interface for it and used much of what he talked about in one of his early books on the subject of 4D and the web. David used a set of routines to parse both incoming URL parameter name/value pairs and form field name/value pairs into parallel arrays. I just recently modified his code to parse them into C_OBJECTs instead.

I did this as I was about to implement a very large, complex query to respond to requests for records from a table of 50,000 records, with up to 30 possible filters or search parameters involving multiple related tables. I must say that using C_OBJECTs made building the query much easier, with fewer lines of code. In the web app the filters are loaded into dictionaries, which I combine into a single dictionary that is sent as a form in the request to 4D. Parsed into a C_OBJECT…

QUERY([Project];[Project]ID>0;*)
If (OB Get(oFormFieldPrameters;"pfTitle")#"")
  QUERY([Project]; & ;[Project]Project_Title=OB Get(oFormFieldPrameters;"pfTitle");*)
End if
If (OB Get(oFormFieldPrameters;"pfConsultants")#"")
  QUERY([Project]; & ;[Project]Consultants=OB Get(oFormFieldPrameters;"pfConsultants");*)
End if
// ...etc., 30 times
QUERY([Project])

I could have done it with the parallel arrays, but this is far more elegant and perhaps even more efficient.

John
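One possible variation on the filter-by-filter pattern in John's message is to drive it from a table of field names instead of writing one `If` block per filter. The sketch below is Python, not 4D, and is only an illustration: `FILTER_COLUMNS` and `active_filters` are hypothetical names, and the two form fields are the only ones the post actually shows.

```python
# Hypothetical mapping from a form-field name to the column it filters on.
# Only these two pairs appear in the post; a real table would list all 30.
FILTER_COLUMNS = {
    "pfTitle": "Project_Title",
    "pfConsultants": "Consultants",
}

def active_filters(form_fields):
    """Return (column, value) pairs for every filter the request filled in."""
    pairs = []
    for field, column in FILTER_COLUMNS.items():
        value = form_fields.get(field, "")  # a missing key counts as empty
        if value != "":
            pairs.append((column, value))
    return pairs

# Usage: only the non-empty filter survives.
request = {"pfTitle": "Bridge", "pfConsultants": ""}
print(active_filters(request))
```

Whether this is actually shorter or clearer than the explicit blocks depends on how uniform the 30 filters are; filters that join to related tables would still need special handling.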
Re: Arrays vs Object for Key/Value pair lookups
On 20 Jul 2017, at 11:39, JPR via 4D_Tech <4d_tech@lists.4d.com> wrote:

> In case of an object, the properties are 'indexed' by using an internal Hash
> table, so the access to one particular Property doesn't need a sequential
> parsing of the list of values, but an almost direct access. I confirm what
> Justin says, that is to say that the bigger will be the array, the more
> efficient will be associative arrays compared with classic parsing of arrays.

Many thanks for refreshing your advice, JPR! Very useful.

Regards

Peter
Re: Arrays vs Object for Key/Value pair lookups
[JPR]

Hi Guys,

The exact thing that I've explained was how to use objects to get Associative Arrays in 4D. Associative Arrays are widely used in other languages like PHP or JavaScript.

In computer science, an Associative Array is an abstract data type composed of a collection of (key, value) pairs, such that each possible key appears just once in the collection. In 4D, the JSON-type Objects are perfect for Associative Arrays.

In many cases, you will have 2 parallel arrays, let's say one for the product code ($arCodes) and one for the product name ($arNames). You want to find the name for a specific code (classic 4D way).

- You create the arrays with a loop, adding element pairs $myCode and $myName:

$k:=Find in array($arCodes;$myCode)
If ($k>0)
  $arNames{$k}:=$myName
Else
  APPEND TO ARRAY($arCodes;$myCode)
  APPEND TO ARRAY($arNames;$myName)
End if

- Then you can find a name from a code:

$k:=Find in array($arCodes;$myCode)
If ($k>0)
  $myName:=$arNames{$k}
Else
  $myName:=""
End if

Now if you use an Object:

C_OBJECT($myArray)

- You create the array with a loop:

OB SET($myArray;$myCode;$myName)

- And you find with:

$myName:=OB Get($myArray;$myCode)

...much simpler, and much faster! Why is it faster? With a classic array, the Find in array command has to parse the array element by element until the correct element is found. (Except in the case of a sorted array, but sometimes you can't sort the array because the index of an element can be meaningful for your method.)

In the case of an object, the properties are 'indexed' by using an internal hash table, so access to one particular property doesn't need a sequential parsing of the list of values, but is an almost direct access. I confirm what Justin says: the bigger the array, the more efficient associative arrays will be compared with classic parsing of arrays.

My very best,

JPR
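For readers who want to see the two patterns from JPR's message side by side outside 4D, here is a minimal Python sketch. It mirrors the parallel-array logic and the associative-array logic only; the function names and sample product codes are made up for the illustration, and Python's `dict` stands in for a 4D object without implying the two are implemented the same way.

```python
def upsert_parallel(codes, names, code, name):
    """Classic way: scan for the code, then update in place or append a new pair."""
    try:
        k = codes.index(code)  # linear scan, comparable to a sequential find
        names[k] = name
    except ValueError:
        codes.append(code)
        names.append(name)

def lookup_parallel(codes, names, code):
    """Classic way: scan for the code and return the matching name, or ""."""
    try:
        return names[codes.index(code)]
    except ValueError:
        return ""

# Parallel arrays
codes, names = [], []
upsert_parallel(codes, names, "A100", "Widget")
upsert_parallel(codes, names, "B200", "Gadget")
upsert_parallel(codes, names, "A100", "Widget v2")  # updates, does not duplicate
print(lookup_parallel(codes, names, "A100"))

# Associative array: setting a key covers both the insert and the update case
products = {}
products["A100"] = "Widget"
products["B200"] = "Gadget"
products["A100"] = "Widget v2"
print(products.get("A100", ""))
```

Note how the associative version needs no "is it already there?" check before writing, which is the convenience Justin also points out.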
Re: Arrays vs Object for Key/Value pair lookups
Case-sensitive comparison support in Find in array/Find in sorted array is long overdue. Even better would be support in the string comparison operators (or new operators for strings):

$true:=($string1=*$string2)
$true:=($string1>*$string2)
$true:=($string1<*$string2)

Alan Chan

4D iNug Technical <4d_tech@lists.4d.com> writes:
>the "Find" commands accept wild cards and evaluate using collation algorithms
>(case-insensitive comparison plus some other locale specific rules)
>is it really fair to compare the two against object keys?
Re: Arrays vs Object for Key/Value pair lookups
> "Find" commands accept wild cards and evaluate using collation algorithms (case-insensitive comparison plus some
> other locale specific rules) is it really fair to compare the two against object keys?

I'm not sure what "fair" means here, but it's definitely not an apples-to-apples comparison. Find in array and Find in sorted array are variants on the same thing, so they're easy to compare. Object keys? No idea. As far as I can tell, 4D has refused to offer any information about how they work that the company will stand behind publicly. That's okay, I'm used to black box testing and I like it...but it's time-consuming.

The point of these comparisons isn't to figure out if one approach is "better" than another so much as how they work *in the real world.* Put another way, the goal is to come up with some rules of thumb about what to use when. Binary search kicks ass, and I know why. Object key lookups kick ass, and I don't know why. My take-away is to

* Use objects when they're easier or more appropriate for the problem at hand.
* Use sorted arrays when they're easier or more appropriate for the problem at hand.
* Don't shift from arrays to objects based on a notion that they're "faster."
* Consider objects instead of arrays if you don't have or can't be sure of a sorted array order, because object key lookups are way faster than sequential array traversal (Find in array).
* Don't worry about speed at all unless you've got a solid reason to.

Thinking back on my tests, a few points for anyone that wants to tweak them:

-- If you want sequential searches to look better, just search for the first items. Search time should be directly related to the position of the target in the array. I avoided this trap on purpose.
-- I used very small text values for lookups and keys! Long strings might behave differently, I don't know. I would actually find that an interesting result, if anyone feels like checking.
-- The object keys are inserted into the test object in sorted order. This should not make any difference if there's a hash underneath, but we don't actually know that. Although it does seem likely.

From the few results I've gotten, I'd wildly guess that there's:

-- An excellent hash function, where "excellent" means "low collision, high dispersal and fast."
-- A secondary structure off the hash bins that is itself smart. So, not a linked list (the CS 101 approach), but a second good hash or a tree of some kind. Or something else.
-- A pretty large range of hash slots to reduce secondary lookup times.
-- Probably some smart scheme for changing the hash table size dynamically under stress. That's an expensive maneuver (or normally is, I can think of ways to make it not too expensive).

Just speculating, I'm probably wrong in every detail here. Doesn't matter. It's a black box.
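The point above about sequential search time tracking the target's position can be checked by counting comparisons instead of timing them. The following is a small Python sketch with synthetic data; `sequential_comparisons` and the generated word list are made up for the illustration and say nothing about how Find in array is implemented internally.

```python
def sequential_comparisons(items, target):
    """Count how many elements a front-to-back scan inspects before it stops.

    Returns len(items) when the target is absent, since the whole list is scanned.
    """
    for count, item in enumerate(items, start=1):
        if item == target:
            return count
    return len(items)

# Synthetic stand-in for a word list (zero-padded so every entry is unique).
words = [f"word{i:06d}" for i in range(100_000)]

# First, middle and last positions: the work grows linearly with the position.
for position in (0, 49_999, 99_999):
    print(position, sequential_comparisons(words, words[position]))
```

A benchmark that only searches for items near the front would therefore flatter the sequential scan, which is the trap the message says it avoided.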
Re: Arrays vs Object for Key/Value pair lookups
The "Find" commands accept wild cards and evaluate using collation algorithms (case-insensitive comparison plus some other locale-specific rules). Is it really fair to compare the two against object keys?

> On 2017/07/18 9:44, David Adams via 4D_Tech <4d_tech@lists.4d.com> wrote:
>
> * Sequential Find in array
> * Binary Find in sorted array
> * Object lookup
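To make the distinction in this message concrete, here is a rough Python sketch of a "loose" find next to an exact-match key lookup. It is a simplification under stated assumptions: `loose_find` is a hypothetical helper that only lowercases both sides and honours a trailing `@` wildcard, which is far less than real collation rules do, and Python's `dict` is used purely as an example of exact-match keys.

```python
def loose_find(items, pattern):
    """Sequential find that ignores case and supports a trailing '@' wildcard.

    Returns the 0-based index of the first match, or -1 if nothing matches.
    """
    wanted = pattern.casefold()
    for index, item in enumerate(items):
        candidate = item.casefold()
        if wanted.endswith("@"):
            if candidate.startswith(wanted[:-1]):
                return index
        elif candidate == wanted:
            return index
    return -1

codes = ["Alpha", "BETA", "gamma"]
prices = {"Alpha": 1, "BETA": 2, "gamma": 3}

print(loose_find(codes, "beta"))   # case-insensitive match
print(loose_find(codes, "GAM@"))   # prefix match through the wildcard
print(prices.get("beta"))          # exact-match key lookup finds nothing
```

The two lookups answer different questions, so a raw speed comparison between them measures different amounts of work per probe.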
Re: Arrays vs Object for Key/Value pair lookups
Hello all,

I've been interested in this topic for some time but have never taken the time to run any tests. I don't have the time now (for sure), but I took some anyway. What I did was grab a huge unique word file, clear out words that are obviously illegal JSON key names, and try doing lookups three ways:

* Sequential Find in array
* Binary Find in sorted array
* Object lookup

For reference, here's a link to the original file full of words:

https://github.com/dwyl/english-words/blob/master/words.zip

I tried not to bias the tests, as what I'd like are useful results. Still, I didn't test a whole lot of different ways, and bias is nearly impossible to avoid. Even if my test is totally fair, there's no way that it's complete - results always depend on the data under test. This is part of why algorithms are described using big O terminology. You have a way of talking about the performance envelope around the algorithm under various sorts of conditions. (That's a terrible description, but so be it.) As an example, it's very easy to compare a sequential find in array with a binary search. There are only a couple of cases where sequential is faster, no matter the size of the array. With 4D's object lookups, we just don't know. Even if they are a hash table (likely but not confirmed), this doesn't tell us much. (Hash tables have a whole lot of components in their implementation, some of which can behave in weird ways, depending on your data set + hashing function. It also matters what you use to find actual values, not just hash bins.)

Anyway, here is a set of results in a compiled system with ~465,000 words/keys:

Words: 466,474
Tests: 10,000
Sequential: 107,777
Binary: 153
Object: 9

The three times are in milliseconds. As in, "Searching for 10,000 different words in an array of 466,474 unique words took a little over 1/10th of a second using a sorted array." That's roughly 1/2 of a blink. (Not kidding.)

Comments and take-aways:

* Binary search is great.
* Object search is great.
* Sequential search is not so great, but it still only took about 11 seconds.
* I noticed that setting up the sorted array took no time and that setting up the object took time that I could feel. I didn't do timing results on this. But if it's true, the *overall* time (including setup) for the object was *unfavorable.*

Conclusion: I'll use objects when I need them and sorted arrays when I need them. The performance difference is too small to be a factor; it will come down to other properties of these data structures.

++ to Justin on the whole 'use the lookup value as the key' tip (Rob has mentioned this too). I use that all of the time in objects, it's a really excellent practice.

If anyone wants to re-run the tests or check my code for logic errors, dumb errors, bias, etc., here's the code with comments:

If (False)
  // https://github.com/dwyl/english-words/blob/master/words.zip
  // Imported into a new table in 4D.
  // Cleared ones starting with numbers or punctuation.
  // 466,475 words left.
End if

//
// Setup
//
ALL RECORDS([word])
ORDER BY([word];[word]word)  // 4D indexed sort
ARRAY TEXT($words_at;0)
SELECTION TO ARRAY([word]word;$words_at)  // Sorted array of 464K+ words

C_OBJECT($words_object)
C_LONGINT($words_count)
C_LONGINT($word_index)
$words_object:=JSON Parse("{}")
$words_count:=Size of array($words_at)
For ($word_index;1;$words_count)
  C_TEXT($word)
  $word:=$words_at{$word_index}
  // {"hello":"HELLO"} - no reason for the lower/upper other than to make it read in the Debugger.
  OB SET($words_object;Lowercase($word);Uppercase($word))
End for

// Let's build an array of random words from the main array of words.
C_LONGINT($test_words_count)
$test_words_count:=10000  // 10,000 tests, as in the results above. Note: Cannot be larger than the $words_count

// Hmmm. Not getting a good distribution of indexes from Random.
// Instead, I'll pick words from different positions along the array.
ARRAY TEXT($test_words_at;$test_words_count)
$test_words_at{1}:=$words_at{1}  // Best case for a sequential scan
$test_words_at{2}:=$words_at{$words_count}  // Worst case for a sequential scan

C_LONGINT($interval)
// Now we want to fill in the rest of the test array.
// The selected words are grabbed from even intervals along the array.
// The speed difference for a sequential search should be linear.
// The speed difference for a binary search should be very small amongst words.
// It should take up to about ~18 reads to find the word.
// The speed difference for the object? No clue, we don't know how they're implemented.
// With a fast hash and a large hash table, it could be very quick. Hard to say.
$interval:=$words_count\$test_words_count

C_LONGINT($test_word_index)
For ($test_word_index;3;$test_words_count)  // Start at 3 because we just filled in 1 & 2 by hand.
  C_LONGINT($word_index)
  $word_index:=$interval*
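For anyone who wants to try the same three-way comparison outside 4D, here is a minimal Python sketch of the test design described above: a sorted list, a key/value map built from it, and test targets picked at even intervals rather than at random. It is not a port of the 4D code. The data is synthetic, the sizes are arbitrary, all function names are hypothetical, and the timings it prints say nothing about 4D's own performance.

```python
import bisect
import time

def evenly_spaced(items, count):
    """Pick `count` (>= 2) items at even intervals, including the first and last."""
    step = (len(items) - 1) / (count - 1)
    return [items[round(i * step)] for i in range(count)]

def time_ms(func, targets):
    """Total wall-clock milliseconds to call func once per target."""
    start = time.perf_counter()
    for target in targets:
        func(target)
    return (time.perf_counter() - start) * 1000

# Synthetic, already-sorted stand-in for the word list (zero-padded keys).
words = [f"w{i:07d}" for i in range(50_000)]
lookup = {word: word.upper() for word in words}
targets = evenly_spaced(words, 200)

def sequential(target):
    """Front-to-back scan of the list."""
    return words.index(target)

def binary(target):
    """Binary search on the sorted list; -1 when the target is absent."""
    i = bisect.bisect_left(words, target)
    return i if i < len(words) and words[i] == target else -1

def hashed(target):
    """Key lookup in the map."""
    return lookup[target]

for name, func in (("Sequential", sequential), ("Binary", binary), ("Map", hashed)):
    print(f"{name}: {time_ms(func, targets):.2f} ms")
```

Sampling at even intervals, first and last items included, keeps the sequential scan from being judged only on its best or worst case.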
Re: Arrays vs Object for Key/Value pair lookups
I did a 2014 Summit presentation (5 JSON Tips), which should be available for download, that demonstrated the benefits of using objects for key/value pair cache lookups, but in the end it's pretty easy to demonstrate. The benefits start to show up with a few hundred keys, but at 100,000 it's easily 20x faster looking up object keys as opposed to Find in array in interpreted. And when you compile, it's literally hundreds of times faster (400-500x) at 100k keys - and the benefits just get bigger and bigger with more keys. That's both for filling the cache and for retrieving values (objects save you from having to check if a key is already in the array before adding it).

--
Justin Leavens
jus...@jitbusiness.com (818) 986-7298 x 701
Just In Time Consulting, Inc.
Custom software for unique businesses
http://www.linkedin.com/in/justinleavens

On July 17, 2017 at 3:46:26 AM, Peter Jakobsson via 4D_Tech (4d_tech@lists.4d.com) wrote:

> Hi
>
> I remember at last year's summit, JPR was emphasising how objects were far
> more optimised than arrays for doing lookups over large numbers of key
> value pairs.
>
> e.g. we usually do this:
>
> $x:=Find in array(myKEYS;"product_code_x")
>
> If ($x>0)
>   $0:=myPRICES{$x}
> End if
>
> How do people prefer to do this with objects? Enumerate the keys in some
> systematic way and then populate the object like this:
>
> For ($i;1;$SIZE)
>
>   $key:=String($i)
>   $value:=myarrayVAL{$i}
>   OB SET($object;$key;$value)
>
> End for
>
> Then for retrieving:
>
> $key:=String($1)
>
> $0:=OB Get($object;$key)
>
> …or was JPR suggesting we use object arrays and do some kind of "find" over
> the object arrays?
>
> Best Regards
>
> Peter
Re: Arrays vs Object for Key/Value pair lookups
On 17 Jul 2017, at 17:03, Herr Alexander Heintz via 4D_Tech <4d_tech@lists.4d.com> wrote:

> so I queried for the language I needed and then
> apply to selection([dict];ob set(<>Dict;[dict]WordKey;[dict]Word))

Ah! So you just 'hoover up' into your dictionary object. Like a hoover?

Peter
Re: Arrays vs Object for Key/Value pair lookups
That's basically it, only I do not need the wrapper anymore. I go directly to

word:=OB Get(<>Dict;$t_MyKey;Is text)

Using arrays, I sorted the key array and used my own optimized array query routine (same as the new Find in sorted array introduced in V16). With objects there is no need to sort; the object system optimizes it by itself.

There is also no need to calculate keys, as my dictionary table is quite simple:

WordKey
Language
Word

so I queried for the language I needed and then

APPLY TO SELECTION([dict];OB SET(<>Dict;[dict]WordKey;[dict]Word))

Ready, set, go. It could not conceivably be easier.

Cheers
Re: Arrays vs Object for Key/Value pair lookups
Thanks Alexander.

Which style of implementation did you use? Did you use the old array lookup key as the new object key in the key/value pair? i.e. did you enumerate the keys like this?

=== OLD WAY ===

ARRAY LONGINT(vArrKeysID;1000)
ARRAY LONGINT(vArrKeysNames;1000)

$x:=Find in array(vArrKeysID;345)

If ($x>0)
  $0:=vArrKeysNames{$x}
End if

=== NEW WAY ===

C_OBJECT($myOBJECT)

For ($i;1;1000)

  $key:=String($i)
  $value:=$i
  OB SET($myOBJECT;$key;$value)

End for

…then for finding (passing the ID in $1):

$key:=String($1)

$0:=OB Get($myOBJECT;$key)

==

Is that how you did it? (i.e. with calculated/hashed keys).

Peter

On 17 Jul 2017, at 13:17, Herr Alexander Heintz via 4D_Tech <4d_tech@lists.4d.com> wrote:

> Using objects was MAGNITUDES faster than synchronised arrays
Re: Arrays vs Object for Key/Value pair lookups
Take a look at the new Find in sorted array command:

http://livedoc.4d.com/4D-Language-Reference-15.4/Arrays/Find-in-sorted-array.301-3274895.en.html

This would change the FIA (Find in array) side of the equation.

Keith - CDI
Re: Arrays vs Object for Key/Value pair lookups
Interesting subject. Just to make sure I am able to interpret the findings correctly, were you comparing Find in array with object find, or *binary* find in array on a sorted array?

Binary search is very hard (but certainly not impossible) to compete with. If you have a sorted array of 1,000,000 elements, it takes at most about 20 comparisons to find a value (log2 of 1,000,000 is just under 20). If you have an unsorted array of 1,000,000 items, then it can take 1,000,000 comparisons to check for a value. Kind of a big deal. A naive, sequential Find in array and a smart binary search on a sorted array are *very* different animals. Conflating the two makes search results based on one meaningless. That's why I'm trying to sort out which of these animals you were comparing with searches on objects.

Objects may be using some kind of hash table which, for sure, ought to beat a sequential find in array. We don't know how many buckets are in the hash table, but say that it's 4,096. With 1,000,000 keys, that cuts your initial search space down to roughly 250 values per bucket. (This could be 0 values or it could be 1,000 - it depends on the data and the hashing function.) That gives you a *massive* optimization very inexpensively. It's still hard to beat a binary search under a normal distribution of values and searches, but it's still way faster than a sequential search.

Then again, we don't actually know *anything* about the way object searches work in 4D, so anything is possible. 4D won't say anything on the subject for reasons they will not discuss. I find this completely puzzling, but there it is.
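The arithmetic in this message can be checked with a few lines of Python. The sketch below is a textbook binary search that counts its probes, plus the average bucket load for the hypothetical 4,096-bucket table mentioned above. It is an illustration only: the function name is made up, the data is a synthetic list of integers, and nothing here describes how 4D implements either search.

```python
import math

def binary_search_steps(items, target):
    """Textbook binary search on a sorted list.

    Returns (index, probes), with index -1 when the target is absent.
    """
    low, high, probes = 0, len(items) - 1, 0
    while low <= high:
        mid = (low + high) // 2
        probes += 1
        if items[mid] == target:
            return mid, probes
        if items[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return -1, probes

n = 1_000_000
items = list(range(n))  # synthetic sorted data

# Worst case for binary search on n items is floor(log2(n)) + 1 probes.
max_probes = math.floor(math.log2(n)) + 1
print("worst-case probes:", max_probes)

# A few sample targets, including one that is not in the list.
for target in (0, n // 3, n - 1, -1):
    print(target, binary_search_steps(items, target))

# Average keys per bucket if n keys were spread evenly over 4,096 buckets.
print("average bucket load:", n / 4096)
```

For 1,000,000 items the worst case works out to 20 probes, against up to 1,000,000 comparisons for a sequential scan, and an evenly loaded 4,096-bucket table would hold about 244 keys per bucket.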
Re: Arrays vs Object for Key/Value pair lookups
I did a lot of testing for this, as I need to keep a dictionary of some 300,000 words identified by word IDs, and I need to retrieve the words based on their ID.

Using objects was MAGNITUDES faster than synchronised arrays (I cannot find the numbers anymore, but we are talking measurable differences here, 1 ms versus several hundred), so I immediately trashed the old array-based code and rewrote it with objects. Never looked back :-)

Cheers
Alex