Hi Dave,

The issue is not in the join itself: Drill can join against an empty schemaless table (for example, an empty JSON file or an empty directory). DRILL-4517 describes exactly this issue. You can add your test case with data to that Jira ticket.
Regarding workarounds, I am not aware of any.

Kind regards
Vitalii

On Thu, May 24, 2018 at 5:19 AM Dave Challis <[email protected]> wrote:

> We've got some processes that dump some reporting data as a bunch of
> parquet files, then runs queries involving joins with those tables (i.e. we
> have a main table which is always non-empty, then a number of link tables
> which join against which can be empty).
>
> The Parquet files contain schema metadata, but some contain no row data.
>
> Trying to join against them in Drill using e.g.
>
> SELECT *
> FROM dfs.`a.parquet` AS A
> JOIN dfs.`b.parquet` AS B ON (A.id=B.id)
> JOIN dfs.`c.parquet` AS C ON (A.id=C.id);
>
> Fails with: "SYSTEM ERROR: IllegalArgumentException: MinorFragmentId 0 has
> no read entries assigned" if either b.parquet or c.parquet contain no rows.
>
> It looks like it might have been reported as an issue here
> https://issues.apache.org/jira/browse/DRILL-4517 , but as it hasn't been
> fixed since 2016, I'm wondering if there are any suggested workarounds for
> the above, rather than waiting for a fix.
>
> In MySQL/Postgres etc., joining against empty tables is fine, so this
> behaviour was a bit unexpected, and is a major blocker for a project I'm
> using Drill for.
>
> Thanks,
> Dave
