RE: DB examples corrupted? and other questions

2003-07-13 Thread Monte Goulding
> Monte Goulding wrote:
>
> > I'm wondering if you have tested multi-dimensional arrays (stored as
> > custom property sets)? I think the direct access nature would kill
> > tab-delimited text. What about using something like: primaryKey,columnName
> > as the array key? I think it could be very fast and handle much more
> > data. In addition there is no risk from the delimiters.
>
> True on all fronts. Random access of specific "records" (elements in an
> array and lines in text) is much faster with arrays than with one block of
> text.
>
> However, in my case I settled on chunks because in nearly every usage I'm
> displaying only a subset of columns, and usually more than one record, so
> I need to query the whole data set each time.  In my benchmarks walking
> through all the elements of an array is slower than walking through a
> text block.
>
> I don't have those benchmarks handy, but if I recall "repeat for each
> element" took nearly twice as long as "repeat for each line".

That's where filter and sort on array keys would come in handy.
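Something like this rough sketch already gets close by treating the keys as
an ordinary text list (it assumes the cTest property set built by the
scripts further down; column 25 is an arbitrary pick):

on mouseUp
  put the customProperties["cTest"] of this stack into tDataA
  put the keys of tDataA into tKeys  -- one "primaryKey,column" key per line
  filter tKeys with "*,25"           -- keep just column 25 of every record
  -- the default comma itemDel matches the key format
  sort lines of tKeys numeric by item 1 of each  -- primary-key order
  put empty into tColumn
  repeat for each line tKey in tKeys
    put tDataA[tKey] & cr after tColumn
  end repeat
  put tColumn
end mouseUp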
>
> Your code:
>
> > on mouseUp
> >   put the long seconds into tSeconds
> >   set the itemDel to tab
> >   put the cTest of this stack into tTest
> >   repeat with x=0 to 10000 step 100
> >     put item 25 of line x of tTest into tData
> >   end repeat
> >   put the long seconds - tSeconds into tTest1
> >   put the long seconds into tSeconds
> >   repeat with x=0 to 10000 step 100
> >     put the cTest[x,25] of this stack into tData
> >   end repeat
> >   put the long seconds - tSeconds into tTest2
> >   put tTest1,tTest2
> > end mouseUp
> >
> > Result 0.219,0.002
>
> ...uses the "repeat with" form, which is much slower than "repeat for
> each" and doesn't scale well; it takes increasingly longer as you work
> your way down through the lines, as it needs to count the number of lines
> each time through the loop.  The "repeat for each line" form runs at a
> nearly constant rate of lines per millisecond regardless of the size of
> the data set.
>
> The "repeat for each" form parses and keeps its place as it goes, making
> it many times faster for large text blocks.
>
I knew someone would pick me up on that.

Cheers

Monte



Re: DB examples corrupted? and other questions

2003-07-13 Thread Richard Gaskin
Monte Goulding wrote:

> I'm wondering if you have tested multi-dimensional arrays (stored as
> custom property sets)? I think the direct access nature would kill
> tab-delimited text. What about using something like: primaryKey,columnName
> as the array key? I think it could be very fast and handle much more data.
> In addition there is no risk from the delimiters.

True on all fronts. Random access of specific "records" (elements in an
array and lines in text) is much faster with arrays than with one block of
text.

However, in my case I settled on chunks because in nearly every usage I'm
displaying only a subset of columns, and usually more than one record, so I
need to query the whole data set each time.  In my benchmarks walking
through all the elements of an array is slower than walking through a text
block.

I don't have those benchmarks handy, but if I recall "repeat for each
element" took nearly twice as long as "repeat for each line".

Your code:
  
> on mouseUp
>   put the long seconds into tSeconds
>   set the itemDel to tab
>   put the cTest of this stack into tTest
>   repeat with x=0 to 10000 step 100
>     put item 25 of line x of tTest into tData
>   end repeat
>   put the long seconds - tSeconds into tTest1
>   put the long seconds into tSeconds
>   repeat with x=0 to 10000 step 100
>     put the cTest[x,25] of this stack into tData
>   end repeat
>   put the long seconds - tSeconds into tTest2
>   put tTest1,tTest2
> end mouseUp
> 
> Result 0.219,0.002

...uses the "repeat with" form, which is much slower than "repeat for each"
and doesn't scale well; it takes increasingly longer as you work your way
down through the lines, as it needs to count the number of lines each time
through the loop.  The "repeat for each line" form runs at a nearly constant
rate of lines per millisecond regardless of the size of the data set.

The "repeat for each" form parses and keeps its place as it goes, making it
many times faster for large text blocks.
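The difference is easy to see with a minimal timing harness (a sketch
against the cTest table from the scripts elsewhere in this thread; both
loops do trivial work):

on mouseUp
  put the cTest of this stack into tTest
  put the long seconds into tStart
  -- indexed access: "line x" forces a scan from the top of tTest each time
  repeat with x = 1 to the number of lines of tTest
    put line x of tTest into tData
  end repeat
  put the long seconds - tStart into tIndexed
  put the long seconds into tStart
  -- streaming access: the engine keeps its place between iterations
  repeat for each line tLine in tTest
    put tLine into tData
  end repeat
  put the long seconds - tStart into tStreamed
  put tIndexed && tStreamed
end mouseUp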

-- 
 Richard Gaskin 
 Fourth World Media Corporation
 Developer of WebMerge 2.2: Publish any database on any site
 ___
 [EMAIL PROTECTED]   http://www.FourthWorld.com
 Tel: 323-225-3717   AIM: FourthWorldInc



RE: DB examples corrupted? and other questions

2003-07-13 Thread Monte Goulding

> The 10,000-record "limit" is by no means absolute, and can be greatly
> exceeded depending on how your data is stored.
>
> The trick is to put tables in simple tab-delimited text in a variable
> rather than in fields on cards.  The latter is where bulk and performance
> issues come from:  the engine needs to make a text record structure for
> each field on every card along with a card record structure in addition
> to the data itself.  Plus, accessing data in a field is almost always
> much slower than grabbing an item from a line in a block of text.

Hey Richard

I'm wondering if you have tested multi-dimensional arrays (stored as
custom property sets)? I think the direct access nature would kill
tab-delimited text. What about using something like: primaryKey,columnName
as the array key? I think it could be very fast and handle much more data.
In addition there is no risk from the delimiters.

The only issue I can see is that the keys would return a very long list,
but this could be parsed quickly using filter etc.
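For example, pulling a single record back out might look like this (a rough
sketch against the cTest property set built by the setup script below;
record 42 is an arbitrary pick, and the default comma itemDel matches the
key format):

on mouseUp
  put the customProperties["cTest"] of this stack into tDataA
  put the keys of tDataA into tKeys  -- one "primaryKey,column" key per line
  filter tKeys with "42,*"           -- keep only the keys for record 42
  sort lines of tKeys numeric by item 2 of each  -- restore column order
  put empty into tRecord
  repeat for each line tKey in tKeys
    put tDataA[tKey] & tab after tRecord
  end repeat
  if tRecord is not empty then delete char -1 of tRecord
  put tRecord
end mouseUp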

Test setup script:

on mouseUp
  -- build a 10000-row by 50-column table as tab-delimited text
  repeat with x=1 to 10000
    repeat with y=1 to 50
      put random(100) & tab after tData
    end repeat
    put cr into char -1 of tData  -- swap the trailing tab for a return
  end repeat
  set the cTest of this stack to tData
  -- build the same data as an array keyed by row,column
  repeat with x=1 to 10000
    repeat with y=1 to 50
      put random(100) into tDataA[x,y]
    end repeat
  end repeat
  set the customProperties["cTest"] of this stack to tDataA
end mouseUp

Test script:

on mouseUp
  put the long seconds into tSeconds
  set the itemDel to tab
  put the cTest of this stack into tTest
  -- sample every 100th record via chunk expressions
  repeat with x=0 to 10000 step 100
    put item 25 of line x of tTest into tData
  end repeat
  put the long seconds - tSeconds into tTest1
  put the long seconds into tSeconds
  -- the same samples via direct array access
  repeat with x=0 to 10000 step 100
    put the cTest[x,25] of this stack into tData
  end repeat
  put the long seconds - tSeconds into tTest2
  put tTest1,tTest2
end mouseUp

Result 0.219,0.002

Food for thought ;-)

Regards

Monte



Re: DB examples corrupted? and other questions

2003-07-12 Thread David Vaughan
On Sunday, Jul 13, 2003, at 11:41 Australia/Sydney, Richard Gaskin
<[EMAIL PROTECTED]> wrote:

> David Vaughan wrote:
> [snip]
> > - ...and of course I could just do it in Rev as my DB size, while not
> > very small, does not push limits (<10,000 records, <10MB).
>
> The 10,000-record "limit" is by no means absolute, and can be greatly
> exceeded depending on how your data is stored.
>
> Richard

Sorry, I expressed myself poorly. I meant that I expected my project to
have fewer than 10K records etc., not that I envisaged some practical
limit at that size. Your following comments on the options are very
interesting.

> The trick is to put tables in simple tab-delimited text in a variable
> rather than in fields on cards.  The latter is where bulk and performance
> issues come from:  the engine needs to make a text record structure for
> each field on every card along with a card record structure in addition
> to the data itself.  Plus, accessing data in a field is almost always
> much slower than grabbing an item from a line in a block of text.
>
> The downside to this simple text-based approach is that you have to
> write your own routines for anything you need.
>
> The upside is that you can use simple chunk expressions so it's easy
> to do.
>
> If the overall size of the data is something that can be managed in RAM,
> RAM-based solutions are hard to beat for speed over paged disk reads.

I have the RAM, so give or take some data integrity issues I will think
further about that approach. It might wind up coming to a speed test, or
building a version in which I isolate data selection and update so the
back end is substitutable.

regards
David


Re: DB examples corrupted? and other questions

2003-07-12 Thread Richard Gaskin
David Vaughan wrote:

> Of the various DB options for OS X, it appears from a quick scan that:
> - MySQL is free, fast and not overly friendly to manage
> - PostgreSQL is free, not as fast but more complete an implementation
> than MySQL
> - Valentina is damned expensive for one user on one machine with no
> intention to sell, but very fast (and Tuviah advocates it for
> single-user databases)
> - SDB is free, runs at Rev speed, and uses a unique DB language (yes, I
> know there is nothing attractive about SQL anyway)
> - ...and of course I could just do it in Rev as my DB size, while not
> very small, does not push limits (<10,000 records, <10MB).

The 10,000-record "limit" is by no means absolute, and can be greatly
exceeded depending on how your data is stored.

In testing a slender db thang I'm making for a client, I've been able to run
queries with five evaluation criteria against 20,000 records in a hair over
4 secs on a G4/500 (and only 1.1 secs on a cheapo $500 Celeron-based HP
running XP -- don't get me started about the platform wars).

The trick is to put tables in simple tab-delimited text in a variable
rather than in fields on cards.  The latter is where bulk and performance
issues come from:  the engine needs to make a text record structure for each
field on every card along with a card record structure in addition to the
data itself.  Plus, accessing data in a field is almost always much slower
than grabbing an item from a line in a block of text.

The downside to this simple text-based approach is that you have to write
your own routines for anything you need.

The upside is that you can use simple chunk expressions so it's easy to do.
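For instance, a multi-criteria query like the one mentioned above boils
down to a single pass (a sketch; the column numbers and thresholds are
invented stand-ins, and cTest is assumed to hold a tab-delimited table):

on mouseUp
  set the itemDel to tab
  put the cTest of this stack into tTable
  put empty into tHits
  -- one pass over the table, keeping records that satisfy every criterion
  repeat for each line tRecord in tTable
    if item 3 of tRecord > 50 and item 7 of tRecord < 20 then
      put tRecord & cr after tHits
    end if
  end repeat
  if tHits is not empty then delete char -1 of tHits
  put the number of lines of tHits && "matching records"
end mouseUp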

If the overall size of the data is something that can be managed in RAM,
RAM-based solutions are hard to beat for speed over paged disk reads.

-- 
 Richard Gaskin 
 Fourth World Media Corporation
 Developer of WebMerge 2.2: Publish any database on any site
 ___
 [EMAIL PROTECTED]   http://www.FourthWorld.com
 Tel: 323-225-3717   AIM: FourthWorldInc



Re: DB examples corrupted? and other questions

2003-07-12 Thread revolution

>>
Andu (Novag) uses SQLite extensively alongside MC apps.
<<

Thanks for that, Pierre.  I came across SQLite by accident myself a couple
of weeks ago, and didn't think to search the Metacard list to see if anyone
had used it.

I know that one friend of mine I introduced to Rev is very happy now that I
also pointed her in the direction of SQLite.  If she has any further
questions about it, I shall point her in the direction of Andu.

David:
If you are really looking to get involved with a SQL database, don't
overlook Firebird.  It has a great pedigree, a good support group at Yahoo
groups, is practically maintenance-free, and has so many different
development tools.  For anyone interested, have a look at
http://www.ibphoenix.com

However, if you don't need a SQL database, avoid the additional complexity
and stick with stacks :-)

Bernard.



Re: DB examples corrupted? and other questions

2003-07-12 Thread Pierre Sahores
On Sun, 2003-07-13 at 03:17, David Vaughan wrote:
> Apropos these discussions about databases, I downloaded Tuviah's 
> DBExamples to have a look at Valentina but attempting to open either of 
> the two examples provided produced the error "unable to open - stack 
> corrupted" (more or less).

It's always better to gzip stacks before down/uploading them. Does anyone
know if Tuviah plans to do the same work for PostgreSQL that he did for
MySQL? I would be very interested :-)
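For what it's worth, decompress() in Rev understands gzip data, so
expanding such a download takes only a few lines (a sketch; the file names
are invented for illustration):

on mouseUp
  put URL "binfile:DBExamples.rev.gz" into tGz  -- read the gzipped stack
  put decompress(tGz) into URL "binfile:DBExamples.rev"
  go to stack "DBExamples.rev"
end mouseUp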
> 
> Are these examples known to be working? I have Rev 2.0.1
> 
> Is anyone planning to write a wrapper for SQLite to be used with Rev. 
> Clearly, I am not but would be interested in using it.

Andu (Novag) uses SQLite extensively alongside MC apps. Ask him directly
about this.
> 
> Of the various DB options for OS X, it appears from a quick scan that:
> - MySQL is free, fast and not overly friendly to manage
> - PostgreSQL is free, not as fast but more complete an implementation
> than MySQL
> - Valentina is damned expensive for one user on one machine with no 
> intention to sell, but very fast (and Tuviah advocates it for 
> single-user databases)
> - SDB is free, runs at Rev speed, and uses a unique DB language (yes, I
> know there is nothing attractive about SQL anyway)
> - ...and of course I could just do it in Rev as my DB size, while not 
> very small, does not push limits (<10,000 records, <10MB).
> 
> My need is for speed followed by simplicity of use. The design looks 
> like three principal tables in the DB and considerable use of live 
> searching (e.g. for each keydown) to link data.
> 
> Any experiences or commentary welcome.
> 
> regards
> David
> 
> ___
> use-revolution mailing list
> [EMAIL PROTECTED]
> http://lists.runrev.com/mailman/listinfo/use-revolution
-- 
With best regards, Pierre Sahores

Application servers & ACID SQL databases
Thinking and producing competitive advantage
