[jira] [Commented] (DRILL-5258) Allow "extended" mock tables access from SQL queries
[ https://issues.apache.org/jira/browse/DRILL-5258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15946106#comment-15946106 ]

Kunal Khatua commented on DRILL-5258:
-------------------------------------

Closing as no QA verification is required.

> Allow "extended" mock tables access from SQL queries
> ----------------------------------------------------
>
> Key: DRILL-5258
> URL: https://issues.apache.org/jira/browse/DRILL-5258
> Project: Apache Drill
> Issue Type: Improvement
> Affects Versions: 1.10.0
> Reporter: Paul Rogers
> Assignee: Paul Rogers
> Priority: Minor
> Labels: ready-to-commit
> Fix For: 1.10.0
>
> DRILL-5152 provided a simple way to generate sample data in SQL using a new,
> simplified version of the mock data generator. This approach is very
> convenient, but is inherently limited. For example, the limited syntax
> available in SQL does not encode much information about columns, such as
> repeat count, data generator, and so on. The simple SQL approach also does
> not allow generating multiple groups of data.
> However, all these features are present in the original mock data source via
> a special JSON configuration file. Previously, only physical plans could
> access that extended syntax.
> This ticket requests a SQL interface to the extended mock data source:
> {code}
> SELECT * FROM `mock`.`example/mock-options.json`
> {code}
> Mock data source options are always stored as a JSON file. Since the existing
> mock data generator for SQL never uses JSON files, a simple rule is that if
> the table name ends in ".json" then it is a specification, else the
> information is encoded in table and column names.
> The format of the data generation syntax is documented in the mock data
> source classes.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
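The table-name rule in the description above can be sketched in a few lines of Java. This is only an illustration of the stated rule, not code from the Drill pull request: the class `MockTableKind`, the enum `Kind`, and the method `classify` are hypothetical names, and the check is assumed to be case-sensitive because the ticket only says the name "ends in .json".

```java
/**
 * Hypothetical helper (not part of Drill) that illustrates the rule from
 * the ticket: a table name ending in ".json" names a JSON specification
 * file; any other name encodes the table definition in the name itself.
 */
final class MockTableKind {

  enum Kind { JSON_SPEC, ENCODED_NAME }

  /** Assumption: the suffix test is case-sensitive. */
  static Kind classify(String tableName) {
    return tableName.endsWith(".json") ? Kind.JSON_SPEC : Kind.ENCODED_NAME;
  }

  public static void main(String[] args) {
    // The table name used in the ticket's SQL example.
    System.out.println(classify("example/mock-options.json"));
    // A made-up name of the "information encoded in the name" kind.
    System.out.println(classify("employees_10K"));
  }
}
```

Because the SQL-only generator never uses JSON files, a suffix test like this is enough to tell the two cases apart without any extra syntax in the query.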
[jira] [Commented] (DRILL-5258) Allow "extended" mock tables access from SQL queries
[ https://issues.apache.org/jira/browse/DRILL-5258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15945702#comment-15945702 ]

Paul Rogers commented on DRILL-5258:
------------------------------------

Development-only issue, no QA verification needed.
[jira] [Commented] (DRILL-5258) Allow "extended" mock tables access from SQL queries
[ https://issues.apache.org/jira/browse/DRILL-5258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15892995#comment-15892995 ]

ASF GitHub Bot commented on DRILL-5258:
---------------------------------------

Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/752
[jira] [Commented] (DRILL-5258) Allow "extended" mock tables access from SQL queries
[ https://issues.apache.org/jira/browse/DRILL-5258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15883847#comment-15883847 ]

ASF GitHub Bot commented on DRILL-5258:
---------------------------------------

Github user sohami commented on the issue:

https://github.com/apache/drill/pull/752

Thanks for the change. LGTM. +1
[jira] [Commented] (DRILL-5258) Allow "extended" mock tables access from SQL queries
[ https://issues.apache.org/jira/browse/DRILL-5258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15883810#comment-15883810 ]

ASF GitHub Bot commented on DRILL-5258:
---------------------------------------

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/752#discussion_r103057839

--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/mock/BooleanGen.java ---
@@ -0,0 +1,42 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store.mock;
+
+import java.util.Random;
+
+import org.apache.drill.exec.vector.BitVector;
+import org.apache.drill.exec.vector.ValueVector;
+
+public class BooleanGen implements FieldGen {
+
+  Random rand = new Random( );
+
+  @Override
+  public void setup(ColumnDef colDef) { }
+
+  public int value( ) {
--- End diff --

Fixed.
[jira] [Commented] (DRILL-5258) Allow "extended" mock tables access from SQL queries
[ https://issues.apache.org/jira/browse/DRILL-5258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15883809#comment-15883809 ]

ASF GitHub Bot commented on DRILL-5258:
---------------------------------------

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/752#discussion_r103057873

--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/mock/MockGroupScanPOP.java ---
@@ -75,20 +76,50 @@
    */
   private boolean extended;
+  private ScanStats scanStats = ScanStats.TRIVIAL_TABLE;

   @JsonCreator
   public MockGroupScanPOP(@JsonProperty("url") String url,
-      @JsonProperty("extended") Boolean extended,
       @JsonProperty("entries") List<MockScanEntry> readEntries) {
     super((String) null);
     this.readEntries = readEntries;
     this.url = url;
-    this.extended = extended == null ? false : extended;
+
+    // Compute decent row-count stats for this mock data source so that
+    // the planner is "fooled" into thinking that this operator wil do
+    // disk I/O.
+
+    int rowCount = 0;
+    int rowWidth = 0;
+    for (MockScanEntry entry : readEntries) {
+      rowCount += entry.getRecords();
+      int width = 0;
+      if (entry.getTypes() == null) {
+        width = 50;
+      } else {
+        for (MockColumn col : entry.getTypes()) {
+          int colWidth = 0;
+          if (col.getWidthValue() == 0) {
+            colWidth = TypeHelper.getSize(col.getMajorType());
+          } else {
+            colWidth = col.getWidthValue();
+          }
+          colWidth *= col.getRepeatCount();
+          width += colWidth;
+        }
+      }
+      rowWidth = Math.max(rowWidth, width);
--- End diff --

Revised names and added comments to make clear what's going on.
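The statistics loop in the diff above is easier to follow with the names the review asked for. The following is a self-contained sketch, not the revised Drill code: `Column` and `Entry` are hypothetical stand-ins for `MockColumn` and `MockScanEntry`, `typeWidth` stands in for the value the diff obtains from `TypeHelper.getSize(...)`, and the default of 50 is copied from the diff.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative only: mirrors the row-count and row-width loop in the diff. */
final class MockScanEstimate {

  /** Hypothetical stand-in for a mock column definition. */
  static final class Column {
    final int declaredWidth; // 0 means no explicit width was given
    final int typeWidth;     // fallback width derived from the column type
    final int repeatCount;

    Column(int declaredWidth, int typeWidth, int repeatCount) {
      this.declaredWidth = declaredWidth;
      this.typeWidth = typeWidth;
      this.repeatCount = repeatCount;
    }
  }

  /** Hypothetical stand-in for a scan entry; columns may be null. */
  static final class Entry {
    final int records;
    final List<Column> columns;

    Entry(int records, List<Column> columns) {
      this.records = records;
      this.columns = columns;
    }
  }

  static final int DEFAULT_ROW_WIDTH = 50; // value used in the diff

  /** Width of one row of a single entry. */
  static int rowWidth(Entry entry) {
    if (entry.columns == null) {
      return DEFAULT_ROW_WIDTH;
    }
    int width = 0;
    for (Column col : entry.columns) {
      int colWidth = col.declaredWidth == 0 ? col.typeWidth : col.declaredWidth;
      width += colWidth * col.repeatCount;
    }
    return width;
  }

  /** Returns {total row count, maximum row width} over all entries. */
  static int[] estimate(List<Entry> entries) {
    int rowCount = 0;
    int maxRowWidth = 0;
    for (Entry entry : entries) {
      rowCount += entry.records;
      maxRowWidth = Math.max(maxRowWidth, rowWidth(entry));
    }
    return new int[] { rowCount, maxRowWidth };
  }

  public static void main(String[] args) {
    List<Entry> entries = Arrays.asList(
        new Entry(100, Arrays.asList(new Column(0, 4, 1), new Column(10, 50, 2))),
        new Entry(50, null));
    int[] stats = estimate(entries);
    System.out.println("rows=" + stats[0] + ", maxRowWidth=" + stats[1]);
  }
}
```

Row counts add up across entries, while the width is the maximum over entries, which is the distinction behind the rename from `rowWidth` and `width` to `maxRowWidth` and `rowWidth`.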
[jira] [Commented] (DRILL-5258) Allow "extended" mock tables access from SQL queries
[ https://issues.apache.org/jira/browse/DRILL-5258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15883811#comment-15883811 ]

ASF GitHub Bot commented on DRILL-5258:
---------------------------------------

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/752#discussion_r103058784

--- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/fn/interp/ExpressionInterpreterTest.java ---
@@ -124,7 +125,7 @@ public void interpreterDateTest() throws Exception {
     final BitControl.PlanFragment planFragment = BitControl.PlanFragment.getDefaultInstance();
     final QueryContextInformation queryContextInfo = planFragment.getContext();
     final int timeZoneIndex = queryContextInfo.getTimeZone();
-    final org.joda.time.DateTimeZone timeZone = org.joda.time.DateTimeZone.forID(org.apache.drill.exec.expr.fn.impl.DateUtility.getTimeZone(timeZoneIndex));
+    final DateTimeZone timeZone = DateTimeZone.forID(org.apache.drill.exec.expr.fn.impl.DateUtility.getTimeZone(timeZoneIndex));
--- End diff --

Original code, but fixed.
[jira] [Commented] (DRILL-5258) Allow "extended" mock tables access from SQL queries
[ https://issues.apache.org/jira/browse/DRILL-5258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15883812#comment-15883812 ]

ASF GitHub Bot commented on DRILL-5258:
---------------------------------------

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/752#discussion_r103058623

--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/mock/package-info.java ---
@@ -60,14 +62,26 @@
  * The mode is one of the supported Drill
  * {@link DataMode} names: usually OPTIONAL or REQUIRED.
  *
+ *
+ * Recent extensions include:
+ *
+ * repeat in either the "entry" or "record" elements allow
--- End diff --

Yes. Added the property to MockScanEntry. Need to add it to the implementation as well, which is planned, but not yet complete.
[jira] [Commented] (DRILL-5258) Allow "extended" mock tables access from SQL queries
[ https://issues.apache.org/jira/browse/DRILL-5258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15883813#comment-15883813 ]

ASF GitHub Bot commented on DRILL-5258:
---------------------------------------

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/752#discussion_r103057899

--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/mock/MockStorageEngine.java ---
@@ -89,14 +85,30 @@ public boolean supportsRead() {
     return true;
   }

-//  public static class ImplicitTable extends DynamicDrillTable {
-//
-//    public ImplicitTable(StoragePlugin plugin, String storageEngineName,
-//        Object selection) {
-//      super(plugin, storageEngineName, selection);
-//    }
-//
-//  }
+  /**
+   * Resolves table names within the mock data source. Tables can be of two forms:
+   *
+   * <name>_<n><unit>
+   *
+   * Where the "name" can be anything, "n" is the number of rows, and "unit" is
+   * the units for the row count: non, K (thousand) or M (million).
+   *
+   * The above form generates a table directly with no other information needed.
+   * Column names must be provided, and must be of the form:
+   *
+   * <name>_<type><length>
+   *
+   * Where the name can be anything, the type must be i (integer), d (double)
+   * or s (string, AKA VarChar). The length is needed only for string fields.
--- End diff --

Fixed.
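The naming convention described in the Javadoc of the diff above can be illustrated with a small parser. This is a sketch under assumptions, not the parsing code in `MockStorageEngine`: it assumes the two forms are `<name>_<n><unit>` for tables and `<name>_<type><length>` for columns, as the surrounding prose describes, it accepts only upper-case `K` and `M`, and the class `MockNameParser` and both method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative parser for the encoded table and column names. */
final class MockNameParser {

  // <name>_<n><unit>: unit is empty, K (thousand) or M (million).
  private static final Pattern TABLE = Pattern.compile("(.+)_(\\d+)([KM]?)");

  // <name>_<type><length>: type is i, d or s; length is optional digits.
  private static final Pattern COLUMN = Pattern.compile("(.+)_([ids])(\\d*)");

  /** Returns the row count encoded in a table name such as "employees_10K". */
  static long rowCount(String tableName) {
    Matcher m = TABLE.matcher(tableName);
    if (!m.matches()) {
      throw new IllegalArgumentException("Not a mock table name: " + tableName);
    }
    long n = Long.parseLong(m.group(2));
    switch (m.group(3)) {
      case "K": return n * 1_000L;
      case "M": return n * 1_000_000L;
      default:  return n;
    }
  }

  /** Returns a readable description of a column name such as "name_s20". */
  static String describeColumn(String columnName) {
    Matcher m = COLUMN.matcher(columnName);
    if (!m.matches()) {
      throw new IllegalArgumentException("Not a mock column name: " + columnName);
    }
    String name = m.group(1);
    String length = m.group(3);
    switch (m.group(2).charAt(0)) {
      case 'i': return name + ": integer";
      case 'd': return name + ": double";
      default:  return name + ": string" + (length.isEmpty() ? "" : "(" + length + ")");
    }
  }

  public static void main(String[] args) {
    System.out.println(rowCount("employees_10K"));
    System.out.println(describeColumn("name_s20"));
    System.out.println(describeColumn("id_i"));
  }
}
```

Supporting another type code, such as the boolean `b` raised in the review, would mean extending the character class in `COLUMN` and adding a matching `case`.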
[jira] [Commented] (DRILL-5258) Allow "extended" mock tables access from SQL queries
[ https://issues.apache.org/jira/browse/DRILL-5258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15877199#comment-15877199 ]

ASF GitHub Bot commented on DRILL-5258:
---------------------------------------

Github user sohami commented on a diff in the pull request:

https://github.com/apache/drill/pull/752#discussion_r102294277

--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/mock/BooleanGen.java ---
@@ -0,0 +1,42 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store.mock;
+
+import java.util.Random;
+
+import org.apache.drill.exec.vector.BitVector;
+import org.apache.drill.exec.vector.ValueVector;
+
+public class BooleanGen implements FieldGen {
+
+  Random rand = new Random( );
+
+  @Override
+  public void setup(ColumnDef colDef) { }
+
+  public int value( ) {
--- End diff --

Extra space between `()`
[jira] [Commented] (DRILL-5258) Allow "extended" mock tables access from SQL queries
[ https://issues.apache.org/jira/browse/DRILL-5258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15877195#comment-15877195 ]

ASF GitHub Bot commented on DRILL-5258:
---------------------------------------

Github user sohami commented on a diff in the pull request:

https://github.com/apache/drill/pull/752#discussion_r102320948

--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/mock/MockStorageEngine.java ---
@@ -89,14 +85,30 @@ public boolean supportsRead() {
     return true;
   }

-//  public static class ImplicitTable extends DynamicDrillTable {
-//
-//    public ImplicitTable(StoragePlugin plugin, String storageEngineName,
-//        Object selection) {
-//      super(plugin, storageEngineName, selection);
-//    }
-//
-//  }
+  /**
+   * Resolves table names within the mock data source. Tables can be of two forms:
+   *
+   * <name>_<n><unit>
+   *
+   * Where the "name" can be anything, "n" is the number of rows, and "unit" is
+   * the units for the row count: non, K (thousand) or M (million).
+   *
+   * The above form generates a table directly with no other information needed.
+   * Column names must be provided, and must be of the form:
+   *
+   * <name>_<type><length>
+   *
+   * Where the name can be anything, the type must be i (integer), d (double)
+   * or s (string, AKA VarChar). The length is needed only for string fields.
--- End diff --

How about boolean (b) as a type?
[jira] [Commented] (DRILL-5258) Allow "extended" mock tables access from SQL queries
[ https://issues.apache.org/jira/browse/DRILL-5258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15877198#comment-15877198 ]

ASF GitHub Bot commented on DRILL-5258:
---------------------------------------

Github user sohami commented on a diff in the pull request:

https://github.com/apache/drill/pull/752#discussion_r102360734

--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/mock/MockGroupScanPOP.java ---
@@ -75,20 +76,50 @@
    */
   private boolean extended;
+  private ScanStats scanStats = ScanStats.TRIVIAL_TABLE;

   @JsonCreator
   public MockGroupScanPOP(@JsonProperty("url") String url,
-      @JsonProperty("extended") Boolean extended,
       @JsonProperty("entries") List<MockScanEntry> readEntries) {
     super((String) null);
     this.readEntries = readEntries;
     this.url = url;
-    this.extended = extended == null ? false : extended;
+
+    // Compute decent row-count stats for this mock data source so that
+    // the planner is "fooled" into thinking that this operator wil do
+    // disk I/O.
+
+    int rowCount = 0;
+    int rowWidth = 0;
+    for (MockScanEntry entry : readEntries) {
+      rowCount += entry.getRecords();
+      int width = 0;
+      if (entry.getTypes() == null) {
+        width = 50;
+      } else {
+        for (MockColumn col : entry.getTypes()) {
+          int colWidth = 0;
+          if (col.getWidthValue() == 0) {
+            colWidth = TypeHelper.getSize(col.getMajorType());
+          } else {
+            colWidth = col.getWidthValue();
+          }
+          colWidth *= col.getRepeatCount();
+          width += colWidth;
+        }
+      }
+      rowWidth = Math.max(rowWidth, width);
--- End diff --

`rowWidth` seems to be `maxRowWidth` and `width` is `rowWidth`. Can we please rename these?
[jira] [Commented] (DRILL-5258) Allow "extended" mock tables access from SQL queries
[ https://issues.apache.org/jira/browse/DRILL-5258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15877196#comment-15877196 ]

ASF GitHub Bot commented on DRILL-5258:
---------------------------------------

Github user sohami commented on a diff in the pull request:

https://github.com/apache/drill/pull/752#discussion_r102298318

--- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/fn/interp/ExpressionInterpreterTest.java ---
@@ -124,7 +125,7 @@ public void interpreterDateTest() throws Exception {
     final BitControl.PlanFragment planFragment = BitControl.PlanFragment.getDefaultInstance();
     final QueryContextInformation queryContextInfo = planFragment.getContext();
     final int timeZoneIndex = queryContextInfo.getTimeZone();
-    final org.joda.time.DateTimeZone timeZone = org.joda.time.DateTimeZone.forID(org.apache.drill.exec.expr.fn.impl.DateUtility.getTimeZone(timeZoneIndex));
+    final DateTimeZone timeZone = DateTimeZone.forID(org.apache.drill.exec.expr.fn.impl.DateUtility.getTimeZone(timeZoneIndex));
--- End diff --

Please remove extra space after `=`
[jira] [Commented] (DRILL-5258) Allow "extended" mock tables access from SQL queries
[ https://issues.apache.org/jira/browse/DRILL-5258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15877197#comment-15877197 ]

ASF GitHub Bot commented on DRILL-5258:
---

Github user sohami commented on a diff in the pull request:

https://github.com/apache/drill/pull/752#discussion_r102336171

--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/mock/package-info.java ---
@@ -60,14 +62,26 @@
  * The mode is one of the supported Drill
  * {@link DataMode} names: usually OPTIONAL or REQUIRED.
  *
+ *
+ * Recent extensions include:
+ *
+ * repeat in either the "entry" or "record" elements allow
--- End diff --

I found a `repeat` definition in `MockColumn` but not in `MockScanEntry`, whereas this comment and the example `example-mock.json` show the `repeat` property at the entry level. Is this a work in progress?
[jira] [Commented] (DRILL-5258) Allow "extended" mock tables access from SQL queries
[ https://issues.apache.org/jira/browse/DRILL-5258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15877194#comment-15877194 ]

ASF GitHub Bot commented on DRILL-5258:
---

Github user sohami commented on a diff in the pull request:

https://github.com/apache/drill/pull/752#discussion_r102362622

--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/mock/MockStorageEngine.java ---
@@ -109,7 +121,37 @@ public MockSchema(MockStorageEngine engine) {
     @Override
     public Table getTable(String name) {
-      Pattern p = Pattern.compile("(\\w+)_(\\d+)(k|m)?", Pattern.CASE_INSENSITIVE);
+      if (name.toLowerCase().endsWith(".json") ) {
+        return getConfigFile(name);
+      } else {
+        return getDirectTable(name);
+      }
+    }
+
+    private Table getConfigFile(String name) {
+      final URL url = Resources.getResource(name);
+      if (url == null) {
+        throw new IllegalArgumentException(
+            "Unable to find mock table config file " + name);
+      }
+      MockTableDef mockTableDefn;
+      try {
+        String json = Resources.toString(url, Charsets.UTF_8);
+        final ObjectMapper mapper = new ObjectMapper();
+        mapper.configure(JsonParser.Feature.ALLOW_UNQUOTED_FIELD_NAMES, true);
+        mockTableDefn = mapper.readValue(json, MockTableDef.class);
+      } catch (JsonParseException e) {
+        throw new IllegalArgumentException( "Unable to parse mock table definition file: " + name, e );
+      } catch (JsonMappingException e) {
+        throw new IllegalArgumentException( "Unable to Jackson deserialize mock table definition file: " + name, e );
+      } catch (IOException e) {
+        throw new IllegalArgumentException( "Unable to read mock table definition file: " + name, e );
+      }
+      return new DynamicDrillTable(engine, this.name, mockTableDefn.getEntries() );
--- End diff --

Please remove the extra space before the last `)`. There are a few other places too, such as line 169 in this file, line 199 (MockTableDef.java), etc.
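The naming rule discussed in this review (a table name ending in ".json" is a specification file; any other name encodes its information directly) can be sketched on its own, outside Drill. The class below is only an illustration: `MockTableNameSketch`, `isConfigFile`, and `rowCount` are hypothetical names, not Drill classes or methods. The regular expression is the one on the removed line of the diff. Reading the optional `k`/`m` suffix as thousands and millions is an assumption, as is the requirement that the whole name match the pattern.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative sketch only; not the Drill implementation. */
final class MockTableNameSketch {

  // Pattern taken from the removed line in the diff: name, row count, optional unit.
  private static final Pattern DIRECT_NAME =
      Pattern.compile("(\\w+)_(\\d+)(k|m)?", Pattern.CASE_INSENSITIVE);

  /** True if the table name should be treated as a JSON specification file. */
  static boolean isConfigFile(String tableName) {
    return tableName.toLowerCase(Locale.ROOT).endsWith(".json");
  }

  /**
   * Row count encoded in a direct table name such as "employee_10K".
   * Assumption: "k" means thousands and "m" means millions.
   */
  static long rowCount(String tableName) {
    Matcher m = DIRECT_NAME.matcher(tableName);
    if (!m.matches()) {
      throw new IllegalArgumentException("Not a direct mock table name: " + tableName);
    }
    long count = Long.parseLong(m.group(2));
    String unit = m.group(3);
    if (unit == null) {
      return count; // no suffix: the digits are the row count
    }
    return unit.equalsIgnoreCase("k") ? count * 1_000L : count * 1_000_000L;
  }

  public static void main(String[] args) {
    String[] names = {"example/mock-options.json", "employee_10K", "orders_2m"};
    for (String name : names) {
      String kind = isConfigFile(name) ? "JSON specification" : rowCount(name) + " rows";
      System.out.println(name + " -> " + kind);
    }
  }
}
```

Checking the suffix first keeps the two naming schemes from colliding: a path such as `example/mock-options.json` contains characters that the direct-name pattern would reject, so it never reaches the regular expression.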
[jira] [Commented] (DRILL-5258) Allow "extended" mock tables access from SQL queries
[ https://issues.apache.org/jira/browse/DRILL-5258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15872710#comment-15872710 ]

ASF GitHub Bot commented on DRILL-5258:
---

GitHub user paul-rogers opened a pull request:

https://github.com/apache/drill/pull/752

DRILL-5258: Access mock data definition from SQL

Extends the mock data source to allow using the full power of the mock data source from an SQL query by referencing the JSON definition file. See JIRA and package-info for details.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/paul-rogers/drill DRILL-5258

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/752.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

This closes #752

commit eb9860d4365f60da3b3c22fc0f96a9acfd31ed5c
Author: Paul Rogers
Date: 2017-02-14T18:02:13Z

DRILL-5258: Access mock data definition from SQL

Extends the mock data source to allow using the full power of the mock data source from an SQL query by referencing the JSON definition file. See JIRA and package-info for details.