On Jun 15, 2007, at 5:27 PM, Scott Hess wrote:


select tableA.path, tableA.value from tableA,tableB where
tableA.path=tableB.path and tableA.value=tableB.value union
select tableA.path, tableA.value from tableA,tableB where
tableA.path=tableB.path and tableA.value IS NULL AND tableB.value IS
NULL;



I think I oversimplified the problem a little bit. In the example I gave, there were just two columns. In the problem I have to solve, there are n columns,
and any item in any column might be NULL.  The above approach will work
well with just two columns, but with multiple columns you would have to
take an iterative approach; building intermediate results and intersecting
with the first column.

What I am trying to do is build efficient set operations. I have very large datasets and have found that the built in union, intersect, and except operations do not seem to use indices (would be happy to be proven wrong here). As such, they
are not very speedy with large large data sets.

If you really can treat null as '', then you might be better off
defining the column as NOT NULL DEFAULT ''.

I am leaning toward avoid NULL and using some other token which I can interpret
as "no data".  Not an ideal solution, but should scale.

There was talk in the mailing list a while back about creating a new operator that would act as a superset of '==' which would treat NULL==NULL as True. I have seen this in some other database. Anybody know if this is on the roadmap?



-----------------------------------------------------------------------------
To unsubscribe, send email to [EMAIL PROTECTED]
-----------------------------------------------------------------------------

Reply via email to