[ https://issues.apache.org/jira/browse/DRILL-4211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Robert Hamilton-Smith updated DRILL-4211: ----------------------------------------- Target Version/s: Future > Inconsistent results from a joined sql statement to postgres tables > ------------------------------------------------------------------- > > Key: DRILL-4211 > URL: https://issues.apache.org/jira/browse/DRILL-4211 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Relational Operators > Affects Versions: 1.3.0 > Environment: Postgres db stroage > Reporter: Robert Hamilton-Smith > Labels: newbie > > When making an sql statement that incorporates a join to a table and then a > self join to that table to get a parent value , Drill brings back > inconsistent results. > Here is the sql in postgres with correct output: > {code:sql} > select trx.categoryguid, > cat.categoryname, w1.categoryname as parentcat > from transactions trx > join categories cat on (cat.CATEGORYGUID = trx.CATEGORYGUID) > join categories w1 on (cat.categoryparentguid = w1.categoryguid) > where cat.categoryparentguid IS NOT NULL; > {code} > Output: > ||categoryid||categoryname||parentcategory|| > |id1|restaurants|food&Dining| > |id1|restaurants|food&Dining| > |id2|Coffee Shops|food&Dining| > |id2|Coffee Shops|food&Dining| > When run in Drill with correct storage prefix: > {code:sql} > select trx.categoryguid, > cat.categoryname, w1.categoryname as parentcat > from db.schema.transactions trx > join db.schema.categories cat on (cat.CATEGORYGUID = trx.CATEGORYGUID) > join db.schema.wpfm_categories w1 on (cat.categoryparentguid = > w1.categoryguid) > where cat.categoryparentguid IS NOT NULL > {code} > Results are: > ||categoryid||categoryname||parentcategory|| > |id1|restaurants|null| > |id1|restaurants|null| > |id2|Coffee Shops|null| > |id2|Coffee Shops|null| > Physical plan is: > {code:sql} > 00-00 Screen : rowType = RecordType(VARCHAR(50) categoryguid, VARCHAR(50) > categoryname, VARCHAR(50) parentcat): rowcount = 100.0, cumulative cost = > {110.0 rows, 110.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 64293 > 00-01 Project(categoryguid=[$0], categoryname=[$1], parentcat=[$2]) : > rowType = RecordType(VARCHAR(50) categoryguid, VARCHAR(50) categoryname, > VARCHAR(50) parentcat): rowcount = 100.0, cumulative cost = {100.0 rows, > 100.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 64292 > 00-02 Project(categoryguid=[$9], categoryname=[$41], parentcat=[$47]) > : rowType = RecordType(VARCHAR(50) categoryguid, VARCHAR(50) categoryname, > VARCHAR(50) parentcat): rowcount = 100.0, cumulative cost = {100.0 rows, > 100.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 64291 > 00-03 Jdbc(sql=[SELECT * > FROM "public"."transactions" > INNER JOIN (SELECT * > FROM "public"."categories" > WHERE "categoryparentguid" IS NOT NULL) AS "t" ON > "transactions"."categoryguid" = "t"."categoryguid" > INNER JOIN "public"."categories" AS "categories0" ON "t"."categoryparentguid" > = "categories0"."categoryguid"]) : rowType = RecordType(VARCHAR(255) > transactionguid, VARCHAR(255) relatedtransactionguid, VARCHAR(255) > transactioncode, DECIMAL(1, 0) transactionpending, VARCHAR(50) > transactionrefobjecttype, VARCHAR(255) transactionrefobjectguid, > VARCHAR(1024) transactionrefobjectvalue, TIMESTAMP(6) transactiondate, > VARCHAR(256) transactiondescription, VARCHAR(50) categoryguid, VARCHAR(3) > transactioncurrency, DECIMAL(15, 3) transactionoldbalance, DECIMAL(13, 3) > transactionamount, DECIMAL(15, 3) transactionnewbalance, VARCHAR(512) > transactionnotes, DECIMAL(2, 0) transactioninstrumenttype, VARCHAR(20) > transactioninstrumentsubtype, VARCHAR(20) transactioninstrumentcode, > VARCHAR(50) transactionorigpartyguid, VARCHAR(255) > transactionorigaccountguid, VARCHAR(50) transactionrecpartyguid, VARCHAR(255) > transactionrecaccountguid, VARCHAR(256) transactionstatementdesc, DECIMAL(1, > 0) transactionsplit, DECIMAL(1, 0) transactionduplicated, DECIMAL(1, 0) > transactionrecategorized, TIMESTAMP(6) transactioncreatedat, TIMESTAMP(6) > transactionupdatedat, VARCHAR(50) transactionmatrulerefobjtype, VARCHAR(50) > transactionmatrulerefobjguid, VARCHAR(50) transactionmatrulerefobjvalue, > VARCHAR(50) transactionuserruleguid, DECIMAL(2, 0) transactionsplitorder, > TIMESTAMP(6) transactionprocessedat, TIMESTAMP(6) > transactioncategoryassignat, VARCHAR(50) transactionsystemcategoryguid, > VARCHAR(50) transactionorigmandateid, VARCHAR(100) fingerprint, VARCHAR(50) > categoryguid0, VARCHAR(50) categoryparentguid, DECIMAL(3, 0) categorytype, > VARCHAR(50) categoryname, VARCHAR(50) categorydescription, VARCHAR(50) > partyguid, VARCHAR(50) categoryguid1, VARCHAR(50) categoryparentguid0, > DECIMAL(3, 0) categorytype0, VARCHAR(50) categoryname0, VARCHAR(50) > categorydescription0, VARCHAR(50) partyguid0): rowcount = 100.0, cumulative > cost = {100.0 rows, 100.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 64259 > {code} > I worked around it by creating a view on postgres but not ideal. Thanks in > advance. > First Drill Jira Bug. -- This message was sent by Atlassian JIRA (v6.3.4#6332)