[ 
https://issues.apache.org/jira/browse/DRILL-5204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Rogers resolved DRILL-5204.
--------------------------------
    Resolution: Fixed

Not sure why this was not closed earlier. The feature has been checked into master.

To use it, set up the mock data source, then run a query such as:

{code}
SELECT id_i, name_s50 FROM `mock`.`customers_1M`
{code}

The column and table names are arbitrary; only the suffix matters. For 
columns, "_i" means integer, "_sx" means a string of length x, and so on. For 
tables, "_x" means x rows, "_xK" means x thousand rows, and "_xM" means x 
million rows.

See the {{ExampleTest}} class for details.
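
As a rough illustration of the suffix convention described above, here is a minimal Java sketch that decodes the two column suffixes and the three table suffixes mentioned in this comment. It is not the Drill implementation: the class name {{MockNameSketch}}, its method names, and the returned type labels ("INT", "VARCHAR(n)") are hypothetical, and the other column suffixes covered by "and so on" are deliberately left out.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative only: a hypothetical decoder, not Drill's actual parser. */
final class MockNameSketch {

    // Table names end in "_<digits>", optionally followed by K or M.
    private static final Pattern TABLE = Pattern.compile(".*_(\\d+)([KM]?)");

    // Column names end in "_i" (integer) or "_s<digits>" (string of that length).
    private static final Pattern COLUMN = Pattern.compile(".*_(i|s(\\d+))");

    /** Returns the row count encoded in a table name such as "customers_1M". */
    static long rowCount(String tableName) {
        Matcher m = TABLE.matcher(tableName);
        if (!m.matches()) {
            throw new IllegalArgumentException("No size suffix: " + tableName);
        }
        long n = Long.parseLong(m.group(1));
        switch (m.group(2)) {
            case "K": return n * 1_000L;      // x thousand rows
            case "M": return n * 1_000_000L;  // x million rows
            default:  return n;               // plain x rows
        }
    }

    /** Returns a type label (an assumption of this sketch) for a column name. */
    static String columnType(String columnName) {
        Matcher m = COLUMN.matcher(columnName);
        if (!m.matches()) {
            throw new IllegalArgumentException("Unsupported suffix: " + columnName);
        }
        if (m.group(1).equals("i")) {
            return "INT";
        }
        // group(2) holds the digits that follow "_s"
        return "VARCHAR(" + Integer.parseInt(m.group(2)) + ")";
    }

    public static void main(String[] args) {
        // Decode the names used in the example query above.
        System.out.println(columnType("id_i"));
        System.out.println(columnType("name_s50"));
        System.out.println(rowCount("customers_1M"));
    }
}
```

Under these assumptions, the query shown earlier would describe an integer column, a 50-character string column, and a table of one million rows.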

> Extend mock data source to use table specs from SQL
> ---------------------------------------------------
>
>                 Key: DRILL-5204
>                 URL: https://issues.apache.org/jira/browse/DRILL-5204
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Tools, Build & Test
>    Affects Versions: 1.9.0
>            Reporter: Paul Rogers
>            Assignee: Paul Rogers
>            Priority: Minor
>
> DRILL-5152 provided a simple way to generate mock data from SQL:
> {code}
> SELECT colName_type FROM `mock`.`tableName_size` ...
> {code}
> The fix in that release encoded types and record counts directly in the SQL, 
> which is very handy for many simple cases.
> The original mock data source has another feature: it lets you create 
> multiple mock blocks of data that can be read in multiple threads. Later 
> additions made it easy to repeat a column definition (to generate, say, a 
> table with 1000 columns), to choose the data generator class, etc. All of 
> this was available only when writing physical plans by hand and encoding the 
> definition in the sub scan for the mock data source.
> This enhancement extends the SQL feature to allow the definitions to appear 
> in a JSON file easily referenced from SQL. The JSON file must be somewhere on 
> the class path (typically in a resources directory). Then:
> {code}
> SELECT red, blue, green FROM `mock`.`foo/colors.json` ...
> {code}
> This is interpreted to mean, "the file colors.json defines a mock data source, 
> perhaps with repeated columns, perhaps with multiple fragments. From that 
> mock data source, select the three columns red, blue and green."
> With this change, tests can include quite sophisticated mock data sources, 
> simplifying debugging of plans with multiple fragments and/or more complex 
> table structures.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)