Re: [U2] Migration

Robert Houben Thu, 23 Dec 2010 22:40:22 -0800

Oh, one more point.  What if your SQL environment had NOT defined a primary key 
for APPOINTMENTS, but had multiple indexes, one of which happened to have 
CUSTOMERNO, APPTDATE, APPTTIME and APPTTYPE.  How would you figure out what to 
use as the item-id of the PICK file?

What if you had a SQL table that actually did not have a set of fields that 
guaranteed a unique value? Then you have NOTHING to create an item-id from!

I have to stop this, it will consume me! :o  But the list goes on.  Oh the 
humanity!

-----Original Message-----
From: u2-users-boun...@listserver.u2ug.org 
[mailto:u2-users-boun...@listserver.u2ug.org] On Behalf Of Robert Houben
Sent: Thursday, December 23, 2010 10:36 PM
To: U2 Users List
Subject: Re: [U2] Migration

I may have been unclear in my earlier post, so I'll clarify.

Consider a CUSTOMER file and an APPOINTMENTS file.  The item-id of the CUSTOMER 
file is the customer number.  The item-id of the APPOINTMENTS file is 
CUSTOMERNO*APPTDATE*APPTTIME*APPTTYPE.

When you have a parent/child relationship in multiple files in MV, except in 
those cases where the key to a child file is the item-id of the parent file 
with a sequential ordinal, the common way to indicate a parent-child 
relationship involving more than one file in MV is to embed, in the parent item 
a multivalued set of either all the entire item-ids of the child table, or the 
portion that needs to be concatenated to the parent's item-id.  That is what I 
call a cross reference field.  If you don't have this, you are faced with 
trying to scan the whole APPOINTMENTS file to find all item-ids that start with 
your CUSTOMERNO value.  What you might actually have in the CUSTOMER file is a 
set of 3 correlated multivalued attributes that have APPTDATE, APPTTIME, and 
APPTTYPE values for all the APPOINTMENTS items that pertain to the CUSTOMER 
item.

In a SQL environment, the primary key to the child table would consist of at 
least two fields, one or more of which would be the full primary key of the 
parent table.  In SQL Server a true primary key forces the file to actually be 
sorted by those key fields (it forces a clustered index).  You can also have 
secondary indexes that are also pre-sorted by their indexed columns.  They are 
effectively complete copies of the indexed fields and a copy of the primary key 
so it can directly read the data once you've found the index entries that match 
your query.   In our example above, you'd have CUSTOMERNO as a primary key to 
the CUSTOMER table, and 4 separate fields (no "*" delimiter) that make up the 
primary key of the APPOINTMENTS table.

I'm not exactly sure how you'd accomplish the same thing in a MultiValued 
environment if you just copied all the tables as flat MV files.  You'd lose the 
ability to access the child records without doing a complete table scan.  
Unless of course you analyzed the data, and created some special linking files, 
but the objective seems to be to avoid human intervention.  You might actually 
be able to do something with an MV index on a dictionary record that references 
just the portion of the child file's item-id that makes up the parent file's 
item-id (in our example, the CUSTOMERNO).  But you'd have to add that, at the 
very least, and that would mean creating a dictionary record as part of what 
you create.  And that's NOT how you'd do it in SQL Server, for instance.

You might have defined foreign key references in the child table, but that's 
not a given, so how you'd even know there was a parent-child relationship in 
place is not clear.  In some cases the naming of the keys in the files can give 
you a hint, (that's how MS Access always tried to figure it out and it worked a 
surprising amount of the time), but you are not guaranteed that this will work 
in all cases.  I've seen plenty of cases where this did not work.

So, you could, in theory get all the data over, but you'd still be faced with 
making it usable in a truly performant way.

Here's another gotcha to consider:

In many SQL tables, including our above example, the primary key will consist 
of several fields.  What do you do in PICK where you have one, and only one, 
item-id attribute?  Do you concatenate and assume fixed length, or do you 
concatenate, choose a separator character and pray it's not contained in the 
data?  What if the concatenation of these fields exceeds the length limit of an 
MV item-id? I know of real world applications where this is the case... There 
are workarounds, but they are NOT automatic!

You will have to choose some rules to work around issues. You will hit 
exceptions that you'll actually have to think about, and you may have to 
redesign some structures, to make them practical.

In short, there are things that you would do in a relational environment that 
don't really have an exact analog that works in the MV world, and vice versa.

There, that's a bit more encompassing. I'm out of time, but I hope this has 
been helpful.

-----Original Message-----
From: u2-users-boun...@listserver.u2ug.org 
[mailto:u2-users-boun...@listserver.u2ug.org] On Behalf Of fft2...@aol.com
Sent: Thursday, December 23, 2010 8:46 PM
To: u2-users@listserver.u2ug.org
Subject: Re: [U2] Migration

In a message dated 12/23/2010 4:28:38 PM Pacific Standard Time, 
antli...@youngman.org.uk writes:

> > SQL uses indexes.  MV uses cross references to item-ids (MV
> > sometimes
> supports indexes, but they don't always work as well as in the
> relational
> world.)
> >
>
> I don't know as that is true ... or are you using the word "index" to
> mean something completely different to me? I'll agree the
> implementation of indices can be buggy, but surely that's true of relational 
> engines too?
>

I'm not quite sure I'm confortable with the idea (expressed in the prior-prior 
posting of which I here quote and enquote the reponse) that MV uses 
cross-references to item-ids.

To me a hash table, isn't the same thing as a cross-reference which sounds a 
lot like a secondary key.  Hashing calculates an exact jump point at which a 
group of related records are kept.  They are related by having the same hash 
value.  But the hash value itself isn't looked up, it's a calculation.

I wonder if you can setup a first normal form table in such a way, that it 
maintains a constant sorted order ?  Sorting on the primary key, would then be 
merely display time bound, there is no effort to it.  I suppose you could even 
pick up and lay down the database periodically so the sort order matches the 
actual disk layout.  Pick could never do something like that.  There is always 
going to be effort involved in any sorting, even if you're simply traversing 
the index tree and grabbing the underlying data records.

But back to reality, I don't think SQL works this way anyway.  The perception 
that it sorts much faster is probably related more closely to the horsepower 
behind the scenes.  Pick systems tend to be installed on slower systems because 
they are so efficient and most users are cheap with their database, and 
expensive with their graphics.  So they install the SQL type databases on 
speedy machines.
_______________________________________________
U2-Users mailing list
U2-Users@listserver.u2ug.org
http://listserver.u2ug.org/mailman/listinfo/u2-users
_______________________________________________
U2-Users mailing list
U2-Users@listserver.u2ug.org
http://listserver.u2ug.org/mailman/listinfo/u2-users
_______________________________________________
U2-Users mailing list
U2-Users@listserver.u2ug.org
http://listserver.u2ug.org/mailman/listinfo/u2-users

Re: [U2] Migration

Reply via email to