Re: Arrays vs Object for Key/Value pair lookups

2017-07-20 Thread David Adams via 4D_Tech
Hey John! Glad to see you're still digging into details.

C_OBJECT is great, I use it all the time. It's always nice to have more
choices in 4D. (You know more languages than most people, so you
doubtlessly were already missing something like C_OBJECT.) I posted speed
numbers a few days ago with some code that anyone can test for themselves
and how I interpret them. For me, I'll use C_OBJECT when it's easier and
sorted arrays when they're easier. The speed difference between access into
sorted arrays and C_OBJECT is trivial to the point of irrelevance. (Arrays
are quicker to set up, C_OBJECT may be a tiny bit faster for lookups. But
both of those things "depend".)

I like the example you provided and agree it's a perfect case for C_OBJECT.
Web input management is a problem that cries out for a simple lookup
structure. One nice thing about C_OBJECT over sorted arrays is that you
don't have to worry about sort order. One nice thing about arrays is that
the access syntax is better than some of the available C_OBJECT access
options.

As an example of when I find arrays easier, I've got one. Well, I've got
lots, but I've got an interesting one. I like to keep an error stack. This
way I can run a bunch of lines of code to validate inputs, etc. as a block
and then check the number of errors after the block. No errors? Carry on!
1+ error(s), deal with it. Sometimes I want to dump everything, most of the
time all I want is a count or to get the most recent error details. Now
this has nothing to do with arrays and associative arrays per se, but in 4D
arrays inside of objects are a bit of a pain. If I want to count the number
of items I either need to stash a counter in the object and pull it out, or
pull the whole array out and get its size. If I want the last item, I have
to pull the whole array out and get the item out of the array. So I keep
the data in an array to make counts and item access simple. But what about
a full dump of the whole stack? It's nice to have it all in one object, so
I embed the array into an object. (It's an array of objects, as it turns
out.) I've got some methods around all of this data so the rest of the
world doesn't have to deal with the issues.
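
A minimal sketch of the error-stack arrangement described above. The variable names, the "code" and "message" keys, and the "errors" property are all hypothetical; this is not David's actual code, only one way the idea could look:

```4d
  // Hypothetical sketch: every name here is illustrative only.
ARRAY OBJECT($errorStack_ao;0)  // One object per error

  // Push an error onto the stack.
C_OBJECT($error)
OB SET($error;"code";-43;"message";"File not found")
APPEND TO ARRAY($errorStack_ao;$error)

  // The count and the most recent item come straight from the array.
C_LONGINT($count)
C_TEXT($lastMessage)
$count:=Size of array($errorStack_ao)
If ($count>0)
$lastMessage:=OB Get($errorStack_ao{$count};"message";Is text)
End if 

  // For a full dump, embed the array in a single object.
C_OBJECT($dump)
C_TEXT($json)
OB SET ARRAY($dump;"errors";$errorStack_ao)
$json:=JSON Stringify($dump;*)
```

Keeping the array as the working structure means counting and reading the last item never require OB GET ARRAY; the object only appears when the whole stack has to travel as one value.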

So, given the current state of play, I'm using C_OBJECT, ARRAY OBJECT, and
sorted arrays. I'm not using object fields and don't think that I've got
any cases for them in 4D. If I really need to store JSON closer to an
external application, I can pipe it over to Postgres in a 'text' field,
which compresses nicely. (Postgres 'json' and 'jsonb' fields don't sound
like they compress well at all. That makes sense to me for jsonb but not
for json. But what do I know?)
**
4D Internet Users Group (4D iNUG)
FAQ:  http://lists.4d.com/faqnug.html
Archive:  http://lists.4d.com/archives.html
Options: http://lists.4d.com/mailman/options/4d_tech
Unsub:  mailto:4d_tech-unsubscr...@lists.4d.com
**

Re: Arrays vs Object for Key/Value pair lookups

2017-07-20 Thread John Baughman via 4D_Tech
I have been loosely following this thread and thought I would share my first 
use of C_OBJECT. Many years ago I inherited a 4D database that David Adams, yes 
you David, had done some work on. Specifically he had created a web interface 
for it and used much of what he talked about in one of his early books on the 
subject of 4D and the web.

David used a set of routines to parse both incoming URL parameters name value 
pairs, as well as form field name value pairs into parallel arrays. I just 
recently modified his code to instead parse them into C_Objects. I did this as 
I was about to implement a very large complex query to respond to requests for 
records from a table of 50,000 records with up to 30 possible filters or search 
parameters involving multiple related tables. 

I must say that using C_Objects for the query made building the query much 
easier and with fewer lines of code. In the web app the filters are loaded into 
Dictionaries, which I combine into a single dictionary which is sent as a form  
in the request to 4D. Parsed into a C_Object…

QUERY([Project];[Project]ID>0;*)

If (OB Get(oFormFieldPrameters;"pfTitle")#"")
QUERY([Project]; & ;[Project]Project_Title=OB Get(oFormFieldPrameters;"pfTitle");*)
End if 

If (OB Get(oFormFieldPrameters;"pfConsultants")#"")
QUERY([Project]; & ;[Project]Consultants=OB Get(oFormFieldPrameters;"pfConsultants");*)
End if 

  // ...etc., 30 times

QUERY([Project])

Could have done it with the parallel arrays, but this is far more elegant and 
perhaps even more efficient.
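
With 30 filters, the repeated If blocks could also be driven by a loop. The sketch below is only an illustration: the two mapping arrays ($keys_at and $fields_ap) are assumptions, not part of John's code, and it assumes every filtered field is a text field.

```4d
  // Hypothetical sketch: the key-to-field mapping arrays are an assumption.
C_OBJECT(oFormFieldPrameters)  // Filled elsewhere from the parsed form

ARRAY TEXT($keys_at;0)
ARRAY POINTER($fields_ap;0)
APPEND TO ARRAY($keys_at;"pfTitle")
APPEND TO ARRAY($fields_ap;->[Project]Project_Title)
APPEND TO ARRAY($keys_at;"pfConsultants")
APPEND TO ARRAY($fields_ap;->[Project]Consultants)
  // ...one pair per filter

C_LONGINT($i)
C_TEXT($value)
QUERY([Project];[Project]ID>0;*)
For ($i;1;Size of array($keys_at))
If (OB Is defined(oFormFieldPrameters;$keys_at{$i}))
$value:=OB Get(oFormFieldPrameters;$keys_at{$i};Is text)
If ($value#"")
QUERY([Project]; & ;$fields_ap{$i}->=$value;*)
End if 
End if 
End for 
QUERY([Project])  // Run the accumulated query
```

Adding a filter then means adding one key/pointer pair rather than another If block.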

John


 

Re: Arrays vs Object for Key/Value pair lookups

2017-07-20 Thread Peter Jakobsson via 4D_Tech

On 20 Jul 2017, at 11:39, JPR via 4D_Tech <4d_tech@lists.4d.com> wrote:

> In case of an object, the properties are 'indexed' by using an internal Hash 
> table, so the access to one particular Property doesn't need a sequential 
> parsing of the list of values, but an almost direct access. I confirm what 
> Justin says, that is to say that the bigger will be the array, the more 
> efficient will be associative arrays compared with classic parsing of arrays.


Many thanks for refreshing your advice JPR !

Very useful.

Regards

Peter


Re: Arrays vs Object for Key/Value pair lookups

2017-07-20 Thread JPR via 4D_Tech


[JPR]

Hi Guys,

The exact thing that I've explained was how to use objects to get Associative 
Arrays in 4D. Associative Arrays are widely used in other languages like PHP or 
JavaScript.

In computer science, an Associative Array  is an abstract data type composed of 
a collection of (key, value) pairs, such that each possible key appears just 
once in the collection. In 4D, the JSON-type Objects are perfect for 
Associative Arrays. 

In many cases, you will have 2 parallel arrays, let's say one for the product 
code ($arCodes), and one for the product name ($arNames). You want to find the 
name for a specific code (classic 4D way)

- You create the array with a loop, adding elements pairs $myCode and $myName:

$k:=Find in array($arCodes;$myCode) 
If ($k>0)
$arNames{$k}:=$myName 
Else
APPEND TO ARRAY($arCodes;$myCode)
APPEND TO ARRAY($arNames;$myName) 
End if

- Then you can find a name from a code:

$k:=Find in array($arCodes;$myCode) 
If ($k>0)
$myName:=$arNames{$k} 
Else
$myName:="" 
End if

Now if you use an Object:

C_OBJECT($myArray)

You create the array with a loop:

OB SET($myArray;$myCode;$myName) 

-and you find with:

$myName:=OB Get($myArray;$myCode) 

...much simpler, and much faster! Why is it faster? With a classic array, the 
Find in array command has to parse the entire array, element per element, until 
the correct element is found (Except in case of a Sorted array, but sometimes 
you can't sort the array because the index of an element can be meaningful for 
your method)

In case of an object, the properties are 'indexed' by using an internal Hash 
table, so the access to one particular Property doesn't need a sequential 
parsing of the list of values, but an almost direct access. I confirm what 
Justin says, that is to say that the bigger will be the array, the more 
efficient will be associative arrays compared with classic parsing of arrays.
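
One detail the array version handles explicitly is the "not found" case ($k>0). A sketch of the equivalent check with an object follows, using OB Is defined; the variable names follow the example above and the codes and names are made up:

```4d
  // Sketch only: values are invented for illustration.
C_OBJECT($myArray)
OB SET($myArray;"A100";"Widget")
OB SET($myArray;"A200";"")  // A key whose value really is empty

C_TEXT($myCode;$myName)
C_BOOLEAN($found;$missing)
$myCode:="A200"
$found:=OB Is defined($myArray;$myCode)  // True: the key exists, even though its value is ""
If ($found)
$myName:=OB Get($myArray;$myCode;Is text)
Else 
$myName:=""
End if 

$missing:=Not(OB Is defined($myArray;"Z999"))  // True: no such key
```

This distinguishes a key that is absent from a key whose stored value happens to be an empty string, which a bare OB Get cannot do.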

My very best,

JPR



> Message: 7
> Date: Mon, 17 Jul 2017 11:43:12 -0700
> From: Justin Leavens 
> To: 4D iNug Technical <4d_tech@lists.4d.com>
> Subject: Re: Arrays vs Object for Key/Value pair lookups
> 
> I did a 2014 Summit presentation (5 JSON Tips) which should be available
> for download that demonstrated the benefits of using objects for key/value
> pair cache lookups, but in the end it’s pretty easy to demonstrate. The
> benefits start to show up with a few hundred keys, but at 100,000 it’s
> easily 20x faster looking up object keys as opposed to find in array in
> interpreted. And when you compile, it’s literally hundreds of times faster
> (400-500x) at 100k keys - and the benefits just get bigger and bigger with
> more keys. That’s both for filling the cache and for retrieving values
> (objects save you from having to check if a key is already in the array
> before adding it).
> 
> --
> Justin Leavens
> jus...@jitbusiness.com   (818) 986-7298 x 701
> Just In Time Consulting, Inc.
> Custom software for unique businesses
> http://www.linkedin.com/in/justinleavens
> 
> On July 17, 2017 at 3:46:26 AM, Peter Jakobsson via 4D_Tech (
> 4d_tech@lists.4d.com) wrote:
> 
> Hi
> 
> I remember at last year’s summit, JPR was emphasising how objects were far
> more optimised than arrays for doing lookups over large numbers of key
> value pairs.
> 
> e.g. we usually do this:
> 
> $x:=find in array(myKEYS;"product_code_x")
> 
> if($x>0)
> $0:=myPRICES{$x}
> end if
> 
> How do people prefer to do this with objects ? Enumerate the keys in some
> systematic way and then populate the object like this >
> 
> For($i;1;$SIZE)
> 
> $key:=string($i)
> $value:=myarrayVAL{$i}
> OB SET($object;$key;$value)
> 
> End For
> 
> Then for retrieving:
> 
> $key:=string($1)
> 
> $0:=OB Get($object;$key)
> 
> …or was JPR suggesting we use object arrays and do some kind of “find” over
> the object arrays ?
> 
> Best Regards
> 
> Peter
> 


Re: Arrays vs Object for Key/Value pair lookups

2017-07-17 Thread Alan Chan via 4D_Tech
Case-sensitive comparison support in Find in array/Find in sorted array is long 
overdue. Even better would be support in the string comparison operators (or new 
operators for strings):

$true:=($string1=*$string2)
$true:=($string1>*$string2)
$true:=($string1<*$string2)

Alan Chan

4D iNug Technical <4d_tech@lists.4d.com> writes:
>the "Find" commands accept wild cards and evaluate using collation algorithms 
>(case-insensitive comparison plus some other locale specific rules)
>is it really fair to compare the two against object keys?
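
The difference raised in the quoted question can be seen in a few lines. This is a sketch with made-up values; it relies on Find in array comparing text without regard to case and on object property names being case-sensitive:

```4d
  // Sketch only: keys and values are invented for illustration.
ARRAY TEXT($keys_at;0)
APPEND TO ARRAY($keys_at;"Widget")

C_OBJECT($lookup)
OB SET($lookup;"Widget";1)

C_LONGINT($pos)
C_BOOLEAN($exactCase;$normalized)
$pos:=Find in array($keys_at;"WIDGET")  // Case-insensitive: matches element 1
$exactCase:=OB Is defined($lookup;"WIDGET")  // Case-sensitive: no such property

  // Normalizing keys on the way in and on the way out makes the two comparable.
OB SET($lookup;Lowercase("Widget");1)
$normalized:=OB Is defined($lookup;Lowercase("WIDGET"))
```

So the two lookups do not answer quite the same question unless the keys are normalized first.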


Re: Arrays vs Object for Key/Value pair lookups

2017-07-17 Thread David Adams via 4D_Tech
> "Find" commands accept wild cards and evaluate using collation algorithms
> (case-insensitive comparison plus some other locale specific rules) is it
> really fair to compare the two against object keys?

I'm not sure what "fair" means here, but it's definitely not an
apples-to-apples comparison. Find in array and Find in sorted array are
variants on the same thing, so they're easy to compare. Object keys? No
idea. As far as I can tell, 4D has refused to offer any information about how
they work that the company will stand behind publicly. That's okay, I'm used
to black box testing and I like it...but it's time-consuming.

The point of these comparisons isn't to figure out if one approach is
"better" than another so much as how they work *in the real world.* Put
another way, the goal is to come up with some rules of thumb about what to
use when. Binary search kicks ass, and I know why. Object key lookups kick
ass, and I don't know why. My take-away is to

* Use objects when they're easier or more appropriate for the problem
at hand.

* Use sorted arrays when they're easier or more appropriate for the
problem at hand.

* Don't shift from arrays to objects based on a notion that they're
"faster."

* Consider objects instead of arrays if you don't have or can't be sure of
a sorted array order because object key lookups are way faster than
sequential array traversal (Find in array.)

* Don't worry about speed at all unless you've got a solid reason to.
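
For the sorted-array option in the list above, a lookup over two parallel arrays might look like the following sketch. The array names and values are hypothetical:

```4d
  // Sketch only: names and values are invented for illustration.
ARRAY TEXT($codes_at;0)
ARRAY TEXT($names_at;0)
APPEND TO ARRAY($codes_at;"B200")
APPEND TO ARRAY($names_at;"Gadget")
APPEND TO ARRAY($codes_at;"A100")
APPEND TO ARRAY($names_at;"Widget")

  // Sort once; the second array is reordered in step with the first.
SORT ARRAY($codes_at;$names_at;>)

C_LONGINT($pos)
C_TEXT($name)
If (Find in sorted array($codes_at;"B200";>;$pos))
$name:=$names_at{$pos}
Else 
$name:=""
End if 
```

The cost is that the arrays must be kept sorted whenever a key is added, which is the bookkeeping an object spares you.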

Thinking back on my tests, a few points for anyone that wants to tweak them:

-- If you want sequential searches to look better, just search for the
first items. Search time should be directly related to the position of the
target in the array. I avoided this trap on purpose.

-- I used very small text values for lookups and keys! Long strings might
behave differently, I don't know. I would actually find that an interesting
result, if anyone feels like checking.

-- The object keys are inserted into the test object in sorted order. This
should not make any difference if there's a hash underneath, but we don't
actually know that. Although it does seem likely. From the few results I've
gotten, I'd wildly guess that there's:

-- An excellent hash function where "excellent" means "low collision, high
dispersal and fast."

-- A secondary structure off the hash bins that is itself smart. So, not a
linked list (The CS 101 approach), but a second good hash or a tree of some
kind. Or something else.

-- A pretty large range of hash slots to reduce secondary lookup times.

-- Probably some smart scheme for changing the hash table size dynamically
under stress. That's an expensive maneuver (or normally is, I can think of
ways to make it not too expensive.)

Just speculating, I'm probably wrong in every detail here. Doesn't matter.
It's a black box.

Re: Arrays vs Object for Key/Value pair lookups

2017-07-17 Thread Keisuke Miyako via 4D_Tech
the "Find" commands accept wild cards and evaluate using collation algorithms 
(case-insensitive comparison plus some other locale specific rules)
is it really fair to compare the two against object keys?

> On 2017/07/18 at 9:44, David Adams via 4D_Tech <4d_tech@lists.4d.com> wrote:
>
> * Sequential Find in array
> * Binary Find in sorted array
> * Object lookup





Re: Arrays vs Object for Key/Value pair lookups

2017-07-17 Thread David Adams via 4D_Tech
Hello all, I've been interested in this topic for some time but have never
taken the time to run any tests. I don't have the time now (for sure), but
I took some anyway. What I did was grab a huge unique word file, clear out
words that are obviously illegal JSON key names and tried doing lookups
three ways:

* Sequential Find in array
* Binary Find in sorted array
* Object lookup

For reference, here's a link to the original file full of words:
https://github.com/dwyl/english-words/blob/master/words.zip

I tried not to bias the tests as what I'd like are useful results. Still, I
didn't test a whole lot of different ways and bias is nearly impossible to
avoid. Even if my test is totally fair, there's no way that it's complete -
results always depend on the data under test. This is part of why
algorithms are described using big O terminology. You have a way of talking
about the performance envelope around the algorithm under various sorts of
conditions. (That's a terrible description, but so be it.) As an example,
it's very easy to compare a sequential find in array with a binary search.
There are only a couple of cases where sequential is faster, no matter the
size of the array. With 4D's object lookups, we just don't know.  Even if
they are a hash table (likely but not confirmed), this doesn't tell us
much. (Hash tables have a whole lot of components in their implementation,
some of which can behave in weird ways, depending on your data set+hashing
function. It also matters what you use to find actual values, not just hash
bins.)

Anyway, here are a set of results in a compiled system with ~465,000
words/keys:

Words: 466,474
Tests: 10,000
Sequential: 107,777
Binary: 153
Object: 9

The three times are in milliseconds. As in, "Searching for 10,000 different
words in an array of 466,474 unique words took a little over 1/10th of a
second using a sorted array." That's roughly 1/2 of a blink. (Not kidding.)

Comments and take-aways:

* Binary search is great.

* Object search is great.

* Sequential search is not so great, but it still only took about 108
seconds for all 10,000 lookups (roughly 11 milliseconds each).

* I noticed that setting up the sorted array took no time and that setting
up the object took time that I could feel. I didn't do timing results on
this. But if it's true, the *overall* time (including setup) for the object
was *unfavorable.*

Conclusion: I'll use objects when I need them and sorted arrays when I need
them. The performance difference is too small to be a factor, it will come
down to other properties of these data structures.
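
For anyone who wants a quick feel for the three approaches without importing a word file, here is a much smaller, self-contained timing sketch. It is not the test code from this message: the key format, sizes, and variable names are assumptions, and it looks up the same worst-case key repeatedly rather than spreading targets along the array.

```4d
  // Hypothetical harness: names and sizes are illustrative only.
C_LONGINT($n;$runs;$i;$pos;$start)
C_LONGINT($ms_sequential;$ms_binary;$ms_object)
C_BOOLEAN($found)
C_TEXT($target)
C_OBJECT($lookup)

$n:=20000
$runs:=1000
ARRAY TEXT($keys_at;$n)
For ($i;1;$n)
$keys_at{$i}:="k"+String($i;"000000")  // Zero-padded, so the array is built in sorted order
OB SET($lookup;$keys_at{$i};$i)
End for 
$target:=$keys_at{$n}  // Last element: worst case for a sequential scan

$start:=Milliseconds
For ($i;1;$runs)
$pos:=Find in array($keys_at;$target)
End for 
$ms_sequential:=Milliseconds-$start

$start:=Milliseconds
For ($i;1;$runs)
$found:=Find in sorted array($keys_at;$target;>;$pos)
End for 
$ms_binary:=Milliseconds-$start

$start:=Milliseconds
For ($i;1;$runs)
$pos:=OB Get($lookup;$target;Is longint)
End for 
$ms_object:=Milliseconds-$start
```

The absolute numbers will depend on the machine and on interpreted versus compiled mode, so only the relative sizes of the three timings are worth reading.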

++ To Justin on the whole 'use the lookup value as the key' tip (Rob has
mentioned this too.) I use that all of the time in objects, it's a really
excellent practice.

If anyone wants to re-run the tests or check my code for logic errors, dumb
errors, bias, etc., here's the code with comments:

If (False)
  // https://github.com/dwyl/english-words/blob/master/words.zip
  // Imported into a new table in 4D.
  // Cleared ones starting with numbers or punctuation.
  // 466,475 words left.
End if 

  //
  // Setup
  //
ALL RECORDS([word])
ORDER BY([word];[word]word)  // 4D indexed sort

ARRAY TEXT($words_at;0)
SELECTION TO ARRAY([word]word;$words_at)  // Sorted array of 464K+ words

C_OBJECT($words_object)
C_LONGINT($words_count)
C_LONGINT($word_index)
$words_object:=JSON Parse("{}")
$words_count:=Size of array($words_at)

For ($word_index;1;$words_count)
C_TEXT($word)
$word:=$words_at{$word_index}
  // {"hello":"HELLO"} - no reason for the lower/upper other than to make it read in the Debugger.
OB SET($words_object;Lowercase($word);Uppercase($word))
End for 

  // Let's build an array of random words from the main array of words.
C_LONGINT($test_words_count)
  // 10,000 tests, as in the results above. Note: Cannot be larger than the $words_count
$test_words_count:=10000

  // Hmmm. Not getting a good distribution of indexes from Random.
  // Instead, I'll pick words from different positions along the array.

ARRAY TEXT($test_words_at;$test_words_count)
$test_words_at{1}:=$words_at{1}  // Best case for a sequential scan
$test_words_at{2}:=$words_at{$words_count}  // Worst case for a sequential scan

C_LONGINT($interval)
  // Now we want to fill in the rest of the test array.
  // The selected words are grabbed from even intervals along the array.

  // The speed difference for a sequential search should be linear.

  // The speed difference for a binary search should be very small amongst words.
  // It should take up to about ~18 reads to find the word.

  // The speed difference for the object? No clue, we don't know how they're implemented.
  // With a fast hash and a large hash table, it could be very quick. Hard to say.
$interval:=$words_count\$test_words_count

C_LONGINT($test_word_index)
For ($test_word_index;3;$test_words_count)  // Start at 3 because we just filled in 1 & 2 by hand.
C_LONGINT($word_index)
$word_index:=$interval*

Re: Arrays vs Object for Key/Value pair lookups

2017-07-17 Thread Justin Leavens via 4D_Tech
I did a 2014 Summit presentation (5 JSON Tips) which should be available
for download that demonstrated the benefits of using objects for key/value
pair cache lookups, but in the end it’s pretty easy to demonstrate. The
benefits start to show up with a few hundred keys, but at 100,000 it’s
easily 20x faster looking up object keys as opposed to find in array in
interpreted. And when you compile, it’s literally hundreds of times faster
(400-500x) at 100k keys - and the benefits just get bigger and bigger with
more keys. That’s both for filling the cache and for retrieving values
(objects save you from having to check if a key is already in the array
before adding it).

--
Justin Leavens
jus...@jitbusiness.com   (818) 986-7298 x 701
Just In Time Consulting, Inc.
Custom software for unique businesses
http://www.linkedin.com/in/justinleavens



Re: Arrays vs Object for Key/Value pair lookups

2017-07-17 Thread Peter Jakobsson via 4D_Tech

On 17 Jul 2017, at 17:03, Herr Alexander Heintz via 4D_Tech 
<4d_tech@lists.4d.com> wrote:

> so I queried for the language I needed and then
> apply to selection([dict];ob set(<>Dict;[dict]WordKey;[dict]Word))

Ah !

So you just ‘hoover up’ into your dictionary object.

Like a hoover ?

Peter


Re: Arrays vs Object for Key/Value pair lookups

2017-07-17 Thread Herr Alexander Heintz via 4D_Tech
That’s basically it.
Only I do not need the wrapper anymore, I go directly to 
word:=OB Get(<>Dict;$t_MyKey;Is text)
Using arrays I sorted the key array and used my own optimized array query 
routine (same as the new Find in sorted array introduced in V16).
With objects, no need to sort, the object system optimizes it by itself.
Only no need to calculate as my dictionary table is quite simple:

WordKey
Language
Word

so I queried for the language I needed and then

apply to selection([dict];ob set(<>Dict;[dict]WordKey;[dict]Word))

ready
set
go

could not be conceivably easier

cheers
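
The same fill could also be written with SELECTION TO ARRAY and a loop. The sketch below assumes the table and field names from the message, that [dict]WordKey is a text field, and a made-up language code and key:

```4d
  // Hypothetical sketch: the language code "de" and the key "greeting" are invented.
C_OBJECT(<>Dict)
QUERY([dict];[dict]Language="de")

ARRAY TEXT($keys_at;0)
ARRAY TEXT($words_at;0)
SELECTION TO ARRAY([dict]WordKey;$keys_at;[dict]Word;$words_at)

C_LONGINT($i)
For ($i;1;Size of array($keys_at))
OB SET(<>Dict;$keys_at{$i};$words_at{$i})
End for 

  // Lookup: the word key itself is the property name.
C_TEXT($word)
$word:=OB Get(<>Dict;"greeting";Is text)
```

Either way the point is the same: the value you search by becomes the property name, so no separate index or sort order has to be maintained.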



Re: Arrays vs Object for Key/Value pair lookups

2017-07-17 Thread Peter Jakobsson via 4D_Tech
Thanks Alexander.

Which style of implementation did you use ? Did you use the old array lookup 
key as the new object key in the key/value pair ? i.e. did you enumerate the 
keys like this: ?

=== OLD WAY ===

ARRAY LONGINT(vArrKeysID; 1000)
ARRAY LONGINT(vArrKeysNames; 1000)

$x:=Find in Array(vArrKeysID;345)

If($x>0)
$0:= vArrKeysNames{$x}
End if

=== NEW WAY ===

C_OBJECT($myOBJECT)

For($i;1;1000)

 $key:=String($i)
 $value:=$i
 OB SET($myOBJECT;$key;$value)

End For

…then for finding (passing the ID in $1):

$key:=string($1)

$0:=ob get($myOBJECT;$key)

==

Is that how you did it ? (i.e. with calculated/hashed keys).

Peter


On 17 Jul 2017, at 13:17, Herr Alexander Heintz via 4D_Tech 
<4d_tech@lists.4d.com> wrote:

> Using objects was MAGNITUDES faster than synchronised arrays


Re: Arrays vs Object for Key/Value pair lookups

2017-07-17 Thread Keith Culotta via 4D_Tech
Take a look at the new Find in sorted array command:
http://livedoc.4d.com/4D-Language-Reference-15.4/Arrays/Find-in-sorted-array.301-3274895.en.html
This would change the Find in array (FIA) side of the equation.

Keith - CDI



Re: Arrays vs Object for Key/Value pair lookups

2017-07-17 Thread David Adams via 4D_Tech
Interesting subject.

Just to make sure I am able to interpret the findings correctly, were
you comparing Find in array with object find, or *binary* find in array on
a sorted array? Binary search is very hard (but certainly not impossible)
to compete with. If you have an array of 1,000,000 elements, it takes
something like 36 operations max to find a value. If you have an unsorted
array of 1,000,000 items then it can take 1,000,000 comparisons to check
for a value.

Kind of a big deal.

A naive, sequential Find in array and a smart binary search on a sorted
array are *very* different animals. Conflating the two makes search results
based on one meaningless.

That's why I'm trying to sort out which of these animals you were comparing
with searches on objects. Objects may be using some kind of hash table
which, for sure, ought to beat a sequential find in array. We don't know
how many buckets are in the hash table, but say that it's 4,096. You cut
your initial search space down to roughly 256 values. (This could be 0
values or it could be 1,000 - it depends on the data and the hashing
function.) That gives you a *massive* optimization very inexpensively. It's
still hard to beat a binary search under a normal distribution of values
and searches, but it's still way faster than a sequential search.
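
As a rough check on the orders of magnitude in this message, assuming a textbook binary search that makes one probe per halving step and a hash function that spreads keys evenly (neither is confirmed for 4D):

```latex
\begin{align*}
\text{binary search, worst case:}\quad
  & \lfloor \log_2 1{,}000{,}000 \rfloor + 1 = 19 + 1 = 20 \text{ probes} \\
\text{sequential search, worst case:}\quad
  & 1{,}000{,}000 \text{ comparisons} \\
\text{average bucket occupancy:}\quad
  & \frac{1{,}000{,}000}{4{,}096} \approx 244 \text{ values}
\end{align*}
```

An implementation that makes two comparisons per step would roughly double the first figure, and the bucket figure rounds to 256 if the key count is taken as 2^20. Either way, the gap between a few dozen operations and a million is the point.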

Then again, we don't actually know *anything* about the way object searches
work in 4D so anything is possible. 4D won't say anything on the subject
for reasons they will not discuss. I find this completely puzzling, but
there it is.

Re: Arrays vs Object for Key/Value pair lookups

2017-07-17 Thread Herr Alexander Heintz via 4D_Tech
I did a lot of testing for this as I need to keep a dictionary of words 
identified by word IDs with some 300 000 items around.
I need to retrieve the words based on their ID.
Using objects was MAGNITUDES faster than synchronised arrays (Cannot find the 
number anymore but we are talking measurable differences here, 1ms to several 
hundred), so I immediately trashed the old array based code and rewrote it with 
objects.
Never looked back :-) 

Cheers
Alex

