RE: [U2] [UD] Indexing 'Best Practices'

2006-08-02 Thread Kevin King
From: David Wolverton
Would we be better off to create a Virtual field definition to
index that carries the 'real' code into the index, unless it is
COMPLETE, in which case we would make the virtual field pass a NULL
and then index this new virtual field using NoNulls?

Assuming COMPLETE can't be changed to null in the app, yes.  One
consideration you haven't discussed is the amount of impact on the
existing app, and this strategy would minimize that impact.  I
wouldn't go writing a subroutine to do the calculation, but with a
simple IF the overhead should be minimal.

But it does not appear to me that the 'size' of the index created has
significant operational cost ...

The size of the index - that is, number of records indexed - does in
fact have an operational cost.  Minimize the size of the index -
particularly if a large percent of the records don't have to be in
there, and performance will positively reflect it.  If you're only
dropping a small percentage of the records from the index with
NO.NULLS you might not notice a difference, but drop a large
percentage and I bet you'd notice the difference.

 possibly not as much the expense of doing all those Virtual Field
computations along the way 

If it's a subroutine, then I would expect the computations to have
some expense.  A simple IF should have minimal impact.

Plus, if we ever DO need it, COMPLETE is indexed for selection.

If COMPLETE is needed, you might consider a separate indexed field
that uses COMPLETE with some other criteria that may be useful for
reporting.  At least then you're starting a new index tree independent
from the other one.

Not sure if this qualifies as best practice, but that would be the
direction I would lean.

-Kevin
[EMAIL PROTECTED]
http://www.PrecisOnline.com
 
** Check out scheduled Connect! training courses at
http://www.PrecisOnline.com/train.html.
---
u2-users mailing list
u2-users@listserver.u2ug.org
To unsubscribe please visit http://listserver.u2ug.org/


Re: [U2] [UD] Indexing 'Best Practices'

2006-08-02 Thread Kate Stanton
I would go for the simplest solution to operate and maintain, almost to hell 
with machine efficiency.  You (or anyone else trying to follow you) are 
probably the most critical and expensive resource.


- Original Message - 
From: David Wolverton [EMAIL PROTECTED]

To: u2-users@listserver.u2ug.org
Sent: Wednesday, August 02, 2006 1:59 PM
Subject: [U2] [UD] Indexing 'Best Practices'



I have a scenario to describe and would like comments as to what others see
as a 'Best Practice' for Indexing. We're on UniData, currentish Windows
version, and have indexes in place on files that are starting to grow at
larger client sites and are heavily modified.

Here is a common index issue we face on some files, and I'd like thoughts
and comments...

A file contains about 500,000 records, growing 100,000 records a year, 
(but

I need to think about 5,000,000 growing 1,000,000 a year as well) and we
have indexed the record 'status' so that we can quickly grab the records
that are in varying stages of completion.  This status field occupies a
'real' attribute and contains entries like  OK, PEND, MAIL, WAIT, ENTRY, 
and

so on. As well, every record at some point and time is finally COMPLETE.
These statuses change ofen - so a given record may travel through 5 to 15
status changes before it becomes COMPLETE.

So we have tens, and hundreds of records (and have to think about 
thousands)
in the different status 'buckets' and can grab those instantly via a 
SELECT

- and that is great and works like a champ -- except that we are also
carrying an index for COMPLETE which is just enormous (and essentially
useless) - every record in the file except for the currently active is
stamped COMPLETE.

We would never have a reason to 'look' at all complete records via this
index, as to do so would be almost the entire file. So if we did not have
this index, I don't see that it would be a problem.

SO... The choice:  Considering Time/Resources/Better Living... Would we be
better off to create a Virtual field definition to index that carries 
the

'real' code into the index, unless it is COMPLETE, in which case we would
make the virtual field pass a NULL and then index this new virtual field
using NoNulls?  Doing so would remove the COMPLETEs from the index - Would
the computation and resources be better served by getting rid of the index
for COMPLETE?  Or would be time spent 'deriving' the Indexes be more
'expensive' in the long run considering these statuses change a lot, which
means the computation would have to be run each time with the only 'net'
saving is the write to the COMPLETE index?

From my view, the 'cost' to remove COMPLETE via a Virtual Field would only
save a disk operation in the life of the record - that is, the actual
'write' of the COMPLETE index record. That and some space on the disk. But
it does not appear to me that the 'size' of the index created has
significant operational cost - possibly not as much the expense of doing 
all

those Virtual Field computations along the way .  The way B+Tree works
(according to WikiPedia!), the system would just keep adding layers, and 
an

insert that triggers a B+Tree 'split' would cause at worst 3 or so extra
disk writes it appears - not a complete cascade. Plus, if we ever DO need
it, COMPLETE is indexed for selection.

Or are disk writes evil incarnate and the Virtual Field removal of that
operations is good riddance?

I see arguments both ways -- Thoughts?  I'm sure others have a similar
issue, and I'm wondering what has been found to be the 'best practice' 
here.


David Wolverton
---
u2-users mailing list
u2-users@listserver.u2ug.org
To unsubscribe please visit http://listserver.u2ug.org/

---
u2-users mailing list
u2-users@listserver.u2ug.org
To unsubscribe please visit http://listserver.u2ug.org/


RE: [U2] [UD] Indexing 'Best Practices'

2006-08-02 Thread David Wolverton
OK - sold!  Virtual field to move COMPLETE to NULL is the answer... As I was
driving back to the hotel after doing that post last night, I got to
thinking that the Virtual Field computation would strictly be CPU/Memory,
which means I can do A LOT of that for the cost of a single disk access.
And with the B+Tree design, we are doomed to at least TWO disk reads in
every case (unless you happen to hit the Root record!) - the first to the
'lead' block, and then a read either 'left' or 'right' - and fresh from
sleep, I can see that as the index grew, that could turn into 3, 4, or 50
traversals (aka Disk reads) left or right to position the key on write. The
leaf size is 4K - PERIOD. So every 100 or so records in a leaf, there would
be some sort of shift of level push that would involve even more write
traffic.

More than likely I'll create a new dictionary that is a Virtual Field, and
index that NO.NULLS and change the routines that need expedited selects that
we were using the index for anyway - any report that would allow COMPLETE
as an option will probably have to go through the whole file, so we may have
to create 'versions' of the report - a faster Select Only From
non-COMPLETE Version that can use Indexes, and a slow, but inclusive Can
Include COMPLETE version which will use the unindexed attribute.

Thanks for all who responded on and off list!  If anyone else knows magic
that goes on behind the scenes on indexes, or has recommendations, please
continue to chime in!
---
u2-users mailing list
u2-users@listserver.u2ug.org
To unsubscribe please visit http://listserver.u2ug.org/


Re: [U2] [UD] Indexing 'Best Practices'

2006-08-02 Thread Raymond DeGennaro II

At 20:59 -0500 2006/08/01, David Wolverton wrote:

So we have tens, and hundreds of records (and have to think about thousands)
in the different status 'buckets' and can grab those instantly via a SELECT
- and that is great and works like a champ -- except that we are also
carrying an index for COMPLETE which is just enormous (and essentially
useless) - every record in the file except for the currently active is
stamped COMPLETE.

We would never have a reason to 'look' at all complete records via this
index, as to do so would be almost the entire file. So if we did not have
this index, I don't see that it would be a problem.


How often is the information in the COMPLETE records needed?
Will a COMPLETE record ever be changed to another status?
How often will you need to mix data from COMPLETE records and records 
with other statuses?


If a COMPLETE record is essentially useless (except for historical 
purposes), I'd consider moving records out of POTENTIALLY_HUGE_FILE 
and into POTENTIALLY_HUGE_FILE_ARCHIVE as soon as they're COMPLETE.


That solves the index problem and keeps the working file smaller 
and/or less full.


Ray

--
.=.
| =-=-=-=-=-=-= Eagle Rock Information Systems Corp =-=-=-=-=-=-= |
| -=-=-=-=-=-=- web and database business solutions -=-=-=-=-=-=- |
|   http://www.eriscorp.commailto:[EMAIL PROTECTED]   |
|Midwest Regional Office: 815-547-0662 (voice)  815-547-0353 (Fax)|
.=.
---
u2-users mailing list
u2-users@listserver.u2ug.org
To unsubscribe please visit http://listserver.u2ug.org/


Re: [U2] [UD] Indexing 'Best Practices'

2006-08-02 Thread TPellitieri
Kate Stanton [EMAIL PROTECTED] wrote on 2 Aug 2006 18:11:11 +1200

 I would go for the simplest solution to operate and maintain,
 almost to hell with machine efficiency.  You (or anyone else
 trying to follow you) are probably the most critical and
 expensive resource.

I must disagree with your reasoning.  System response time is critical in a
customer service situation.  If a customer has to wait a noticeable time
because the programmer didn't make an effort to provide efficient code,
it's a waste of the customer's time - and I would suggest that a customer's
time is a company's most critical resource, and wasting it is your most
expensive mistake.

--Tom Pellitieri
  Century Equipment
  Toledo, Ohio
---
u2-users mailing list
u2-users@listserver.u2ug.org
To unsubscribe please visit http://listserver.u2ug.org/


RE: [U2] [UD] Indexing 'Best Practices'

2006-08-02 Thread Bob Woodward
Hi David,

Don't know if this is possible or not but couldn't you write your
COMPLETE records to a .HIST file, keeping the rest of the data in a
.CURR file.  Then have a PART file definition that binds the two parts
into what is the current filename?  You could then index and select off
the .CURR file and be able to process off either the PART filename or
the .CURR filename.

I'm not real familiar with PART files but it seems like it should work.
I'm not even sure if PART files are available in UD but thought the
suggestion was worth mentioning.

Good luck,

Bob Woodward
Senior Programmer/Analyst
Harbor Wholesale Grocery

 
-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of David Wolverton
Sent: Wednesday, August 02, 2006 6:53 AM
To: u2-users@listserver.u2ug.org
Subject: RE: [U2] [UD] Indexing 'Best Practices' 

OK - sold!  Virtual field to move COMPLETE to NULL is the answer... As I
was
driving back to the hotel after doing that post last night, I got to
thinking that the Virtual Field computation would strictly be
CPU/Memory,
which means I can do A LOT of that for the cost of a single disk access.
And with the B+Tree design, we are doomed to at least TWO disk reads in
every case (unless you happen to hit the Root record!) - the first to
the
'lead' block, and then a read either 'left' or 'right' - and fresh from
sleep, I can see that as the index grew, that could turn into 3, 4, or
50
traversals (aka Disk reads) left or right to position the key on write.
The
leaf size is 4K - PERIOD. So every 100 or so records in a leaf, there
would
be some sort of shift of level push that would involve even more write
traffic.

More than likely I'll create a new dictionary that is a Virtual Field,
and
index that NO.NULLS and change the routines that need expedited selects
that
we were using the index for anyway - any report that would allow
COMPLETE
as an option will probably have to go through the whole file, so we may
have
to create 'versions' of the report - a faster Select Only From
non-COMPLETE Version that can use Indexes, and a slow, but inclusive
Can
Include COMPLETE version which will use the unindexed attribute.

Thanks for all who responded on and off list!  If anyone else knows
magic
that goes on behind the scenes on indexes, or has recommendations,
please
continue to chime in!
---
u2-users mailing list
u2-users@listserver.u2ug.org
To unsubscribe please visit http://listserver.u2ug.org/
---
u2-users mailing list
u2-users@listserver.u2ug.org
To unsubscribe please visit http://listserver.u2ug.org/


RE: [U2] [UD] Indexing 'Best Practices'

2006-08-02 Thread David Wolverton
UniData does not have the 'part' files like UniVerse - it would be cool -
but not a choice for us without 'doing it manually'.

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Bob Woodward
Sent: Wednesday, August 02, 2006 11:11 AM
To: u2-users@listserver.u2ug.org
Subject: RE: [U2] [UD] Indexing 'Best Practices' 

Hi David,

Don't know if this is possible or not but couldn't you write your COMPLETE
records to a .HIST file, keeping the rest of the data in a .CURR file.  Then
have a PART file definition that binds the two parts into what is the
current filename?  snip
---
u2-users mailing list
u2-users@listserver.u2ug.org
To unsubscribe please visit http://listserver.u2ug.org/