Agreed, that this is how it is often done.
But there is nothing that says that this is how it has to be done OR
even how it should be done.

These 2 files could be setup such that :

  The CUSTOMER file contains no such embedded 'cross reference' field
for appointments

  The APPOINTMENTS file could be keyed as :
        @ID =  CUSTOMERNO*APPTDATE*APPTTIME*APPTTYPE
     With ITYPE entries setup for 
        CUSTOMERNO = @ID['*',1,1]
        APPTDATE = @ID['*',2,1]
        APPTTIME = @ID['*',3,1]
        APPTYPE = @ID['*',4,1]

    -- OR --
        @ID =  sequentially generated 
     With data fields defined for  CUSTOMERNO, APPTDATE, APPTTIME &
APPTYPE

    In both cases, indexed on CUSTOMERNO.


I believe you will find the following 2 snippets pretty comparable
performance wise :

        * I know there are other better (?) ways to do this
        * insert your own list traversal code
        appcnt=dcount(custrec<appxref>,@VM )
        for appidx = 1 to appcnt
                appid = custrec<appxref,appidx>
        next

        execute 'SELECT APPOINTMENTS WITH CUSTOMERNO = ':squote(custno)
        loop while readnext appid
        repeat

Dropping the xref field in the parent table would help immensely with
file sizing as records would tend to be a lot closer in size to the
average instead of having CUSTOMER records that have accumulate massive
xref fields over time - this is a problem I see in systems all the time.
This also gets around your problem of knowing what is related to what -
you know that all keys are automatic indexes in sql server and anything
that could be used for lookup/joining purposes should also have an
explicit index in sql server else it would be a bottleneck in sql server
as well.

There is also nothing that says that a file's key value has to be based
on anything related to the data in the record - some would argue that
key values should NEVER be related to the data.  The only reason to
prefer the 1st method over the 2nd is for human readability which imo
shouldn't be a concern anyway.
The 2nd scheme also gets around the sql->mv field mapping, no defined
relation, no key, non-unique key and key length max exceeded problems,
and would definitely produce a much more even hash distribution.



Just some thoughts,  Gerry







-----Original Message-----
From: u2-users-boun...@listserver.u2ug.org
[mailto:u2-users-boun...@listserver.u2ug.org] On Behalf Of Robert Houben
Sent: December 24, 2010 01:41 AM
To: U2 Users List
Subject: Re: [U2] Migration

Oh, one more point.  What if your SQL environment had NOT defined a
primary key for APPOINTMENTS, but had multiple indexes, one of which
happened to have CUSTOMERNO, APPTDATE, APPTTIME and APPTTYPE.  How would
you figure out what to use as the item-id of the PICK file?

What if you had a SQL table that actually did not have a set of fields
that guaranteed a unique value? Then you have NOTHING to create an
item-id from!

I have to stop this, it will consume me! :o  But the list goes on.  Oh
the humanity!

-----Original Message-----
From: u2-users-boun...@listserver.u2ug.org
[mailto:u2-users-boun...@listserver.u2ug.org] On Behalf Of Robert Houben
Sent: Thursday, December 23, 2010 10:36 PM
To: U2 Users List
Subject: Re: [U2] Migration

I may have been unclear in my earlier post, so I'll clarify.

Consider a CUSTOMER file and an APPOINTMENTS file.  The item-id of the
CUSTOMER file is the customer number.  The item-id of the APPOINTMENTS
file is CUSTOMERNO*APPTDATE*APPTTIME*APPTTYPE.

When you have a parent/child relationship in multiple files in MV,
except in those cases where the key to a child file is the item-id of
the parent file with a sequential ordinal, the common way to indicate a
parent-child relationship involving more than one file in MV is to
embed, in the parent item a multivalued set of either all the entire
item-ids of the child table, or the portion that needs to be
concatenated to the parent's item-id.  That is what I call a cross
reference field.  If you don't have this, you are faced with trying to
scan the whole APPOINTMENTS file to find all item-ids that start with
your CUSTOMERNO value.  What you might actually have in the CUSTOMER
file is a set of 3 correlated multivalued attributes that have APPTDATE,
APPTTIME, and APPTTYPE values for all the APPOINTMENTS items that
pertain to the CUSTOMER item.

In a SQL environment, the primary key to the child table would consist
of at least two fields, one or more of which would be the full primary
key of the parent table.  In SQL Server a true primary key forces the
file to actually be sorted by those key fields (it forces a clustered
index).  You can also have secondary indexes that are also pre-sorted by
their indexed columns.  They are effectively complete copies of the
indexed fields and a copy of the primary key so it can directly read the
data once you've found the index entries that match your query.   In our
example above, you'd have CUSTOMERNO as a primary key to the CUSTOMER
table, and 4 separate fields (no "*" delimiter) that make up the primary
key of the APPOINTMENTS table.

I'm not exactly sure how you'd accomplish the same thing in a
MultiValued environment if you just copied all the tables as flat MV
files.  You'd lose the ability to access the child records without doing
a complete table scan.  Unless of course you analyzed the data, and
created some special linking files, but the objective seems to be to
avoid human intervention.  You might actually be able to do something
with an MV index on a dictionary record that references just the portion
of the child file's item-id that makes up the parent file's item-id (in
our example, the CUSTOMERNO).  But you'd have to add that, at the very
least, and that would mean creating a dictionary record as part of what
you create.  And that's NOT how you'd do it in SQL Server, for instance.

You might have defined foreign key references in the child table, but
that's not a given, so how you'd even know there was a parent-child
relationship in place is not clear.  In some cases the naming of the
keys in the files can give you a hint, (that's how MS Access always
tried to figure it out and it worked a surprising amount of the time),
but you are not guaranteed that this will work in all cases.  I've seen
plenty of cases where this did not work.

So, you could, in theory get all the data over, but you'd still be faced
with making it usable in a truly performant way.

Here's another gotcha to consider:

In many SQL tables, including our above example, the primary key will
consist of several fields.  What do you do in PICK where you have one,
and only one, item-id attribute?  Do you concatenate and assume fixed
length, or do you concatenate, choose a separator character and pray
it's not contained in the data?  What if the concatenation of these
fields exceeds the length limit of an MV item-id? I know of real world
applications where this is the case... There are workarounds, but they
are NOT automatic!

You will have to choose some rules to work around issues. You will hit
exceptions that you'll actually have to think about, and you may have to
redesign some structures, to make them practical.

In short, there are things that you would do in a relational environment
that don't really have an exact analog that works in the MV world, and
vice versa.

There, that's a bit more encompassing. I'm out of time, but I hope this
has been helpful.

-----Original Message-----
From: u2-users-boun...@listserver.u2ug.org
[mailto:u2-users-boun...@listserver.u2ug.org] On Behalf Of
fft2...@aol.com
Sent: Thursday, December 23, 2010 8:46 PM
To: u2-users@listserver.u2ug.org
Subject: Re: [U2] Migration

In a message dated 12/23/2010 4:28:38 PM Pacific Standard Time,
antli...@youngman.org.uk writes:


> > SQL uses indexes.  MV uses cross references to item-ids (MV
> > sometimes
> supports indexes, but they don't always work as well as in the
> relational
> world.)
> >
>
> I don't know as that is true ... or are you using the word "index" to
> mean something completely different to me? I'll agree the
> implementation of indices can be buggy, but surely that's true of
relational engines too?
>

I'm not quite sure I'm confortable with the idea (expressed in the
prior-prior posting of which I here quote and enquote the reponse) that
MV uses cross-references to item-ids.

To me a hash table, isn't the same thing as a cross-reference which
sounds a lot like a secondary key.  Hashing calculates an exact jump
point at which a group of related records are kept.  They are related by
having the same hash value.  But the hash value itself isn't looked up,
it's a calculation.

I wonder if you can setup a first normal form table in such a way, that
it maintains a constant sorted order ?  Sorting on the primary key,
would then be merely display time bound, there is no effort to it.  I
suppose you could even pick up and lay down the database periodically so
the sort order matches the actual disk layout.  Pick could never do
something like that.  There is always going to be effort involved in any
sorting, even if you're simply traversing the index tree and grabbing
the underlying data records.

But back to reality, I don't think SQL works this way anyway.  The
perception that it sorts much faster is probably related more closely to
the horsepower behind the scenes.  Pick systems tend to be installed on
slower systems because they are so efficient and most users are cheap
with their database, and expensive with their graphics.  So they install
the SQL type databases on speedy machines.
_______________________________________________
U2-Users mailing list
U2-Users@listserver.u2ug.org
http://listserver.u2ug.org/mailman/listinfo/u2-users
_______________________________________________
U2-Users mailing list
U2-Users@listserver.u2ug.org
http://listserver.u2ug.org/mailman/listinfo/u2-users
_______________________________________________
U2-Users mailing list
U2-Users@listserver.u2ug.org
http://listserver.u2ug.org/mailman/listinfo/u2-users


_______________________________________________
U2-Users mailing list
U2-Users@listserver.u2ug.org
http://listserver.u2ug.org/mailman/listinfo/u2-users

Reply via email to