On Tuesday, March 03, 2015 10:50 AM, Dan Kennedy wrote:
> On 03/03/2015 03:57 PM, Andy Gibbs wrote:
>> Hi,
>>
>> The following is a reduced test-case of a problem I have observed:
>>
>> CREATE TABLE a(tid,pid);
>> CREATE TABLE b(tid,pname);
>> CREATE TEMP TABLE pidmap(newpid INTEGER,oldpid INTEGER);
>> CREATE TEMP TABLE pnamemap(pid INTEGER,pname TEXT COLLATE BINARY);
>>
>> (Please note that the tables are usually populated with data.)
>>
>> SELECT a.ROWID,b.ROWID FROM a
>>  INNER JOIN b ON a.tid=b.tid
>>              AND a.pid=(SELECT pid FROM pidmap WHERE pname=b.pname);
>>
>> The same problem occurs with a similar query:
>>
>> SELECT a.ROWID,b.ROWID FROM a
>>  INNER JOIN b ON a.tid=b.tid
>>  WHERE a.pid=(SELECT pid FROM pidmap WHERE pname=b.pname);
>>
>> The problem comes from an unintentional mistake: using the table
>> pidmap in the sub-query, rather than pnamemap.
>>
>> Running the sub-query on its own gives (as expected):
>>
>> SELECT pid FROM pidmap;
>> Error: no such column: pid
>>
>> This also results in an error:
>>
>> SELECT a.ROWID,b.ROWID FROM a
>>  INNER JOIN b ON a.tid=b.tid
>>              AND a.pid=(SELECT xyz FROM pidmap WHERE pname=b.pname);
>> Error: no such column: xyz
>>
>> And, of course, correcting the original query works as expected:
>>
>> SELECT a.ROWID,b.ROWID FROM a
>>  INNER JOIN b ON a.tid=b.tid
>>              AND a.pid=(SELECT pid FROM pnamemap WHERE pname=b.pname);
>>
>> Seems to me that sqlite is resolving the column name from the incorrect
>> scope in the sub-query?
>>
>> I have observed this in v3.7.10, but have also tested it in v3.8.8.3.
> 
> I think it's correct, no? Since there is no column "pid" to match 
> against in the sub-query, the "pid" within the sub-query refers to 
> "a.pid" from the outer query.
> 
>   http://en.wikipedia.org/wiki/Correlated_subquery

I am certainly no SQL language guru, but I should have thought that the
logic should work this way... (taking just the sub-query here)

SELECT pid FROM pidmap WHERE pname=b.pname

  -- the referenced table in "FROM" is unambiguously pidmap since there
     is no further join, etc. in this sub-query
  -- the column list should (IMHO) apply to this table alone, i.e. "pid"
     here, unless fully qualified, causing an error if the column doesn't
     exist in pidmap
  -- in the "WHERE" clause, a more relaxed lookup can be performed and
     include references to the outer query.

In the same way that SQL allows the disambiguation "b.pname" in the
"WHERE" clause, if the user wished to do something like this...

SELECT a.pid FROM pidmap WHERE pname=b.pname

... then this should work - and does even though I can't think of a use-
case for this!

Of course, by extension, this statement gives the expected error:

SELECT a.ROWID,b.ROWID FROM a
  INNER JOIN b ON a.tid=b.tid
              AND a.pid=(SELECT c.pid FROM pidmap AS c
                           WHERE c.pname=b.pname);

but to me this is counter-intuitive and needlessly verbose -- it is
"clear" from the original query that "SELECT pid FROM pidmap" expects
pid to be found in pidmap.  (Clear is in quotes since it may not be
clear to the database engine although it may be clear to user!)

Even if it is legal SQL (again I'm not an expert!) then I think it is
at best ambiguous, and perhaps there could be an advantage here for a
shell or pragma option that can highlight likely typos such as that
detailed here, i.e. to provide a warning message if not an error.

Andy

Reply via email to