[jira] [Updated] (DRILL-6667) Include internal data sets in documentation Sample Datasets
[ https://issues.apache.org/jira/browse/DRILL-6667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paul Rogers updated DRILL-6667:
---
Description:
The Drill documentation provides the "Sample Datasets" section, which is very handy. However, this section does not discuss the two datasets provided with Drill itself.
* Julian Hyde's [FoodMart data set|https://github.com/julianhyde/foodmart-data-hsqldb], available on the class path.
* TPC-H data set, available on the class path in {{tpch}}
The "FoodMart" data set is available directly under {{cp}}. In fact, the Drill sample query (see below) references a FoodMart table. To see the list of tables (at development time), find the {{foodmart-data-json-0.4.jar}} file in the Maven dependencies for {{drill-java-exec}}. The table names here are simplified relative to those in the ER diagram in the above link. Perhaps include a simple table with names, and the mapping to the original names, and a link to (or just embed) the FoodMart ER image. The data is available in JSON format.
TPCH data is available in {{`cp`.`tpch/*.parquet`}}, in Parquet format. The schema is described in the [TPC-H specification|http://www.tpc.org/tpc_documents_current_versions/current_specifications.asp].
These are very handy, but hard to find: I find I must keep searching the source code to remember file names and directory paths. End users won't have this luxury.
Suggestion: Describe the files available in the class path data source.
Along these same lines, in "Connect a Data Source", there is no mention of the class path data source. Yet, we reference that data source in the Web Console where we suggest a sample query to run:
{code}
Sample SQL query: SELECT * FROM cp.`employee.json` LIMIT 20
{code}
The above query refers to the FoodMart data set.

was:
The Drill documentation provides the "Sample Datasets" section, which is very handy. However, this section does not discuss the two datasets provided with Drill itself.
* Julian Hyde's [FoodMart data set|https://github.com/julianhyde/foodmart-data-hsqldb], available on the class path.
* TPC-H data set, available on the class path in {{tpch}}
The "FoodMart" data set is available directly under {{cp}}. In fact, the Drill sample query (see below) references a FoodMart table. To see the list of tables (at development time), find the {{foodmart-data-json-0.4.jar}} file in the Maven dependencies for {{drill-java-exec}}. The table names here are simplified relative to those in the ER diagram in the above link. Perhaps include a simple table with names, and the mapping to the original names, and a link to (or just embed) the FoodMart ER image. The data is available in JSON format.
TPCH data is available in {{`cp`.`tpch/*.parquet`}}, in Parquet format. The schema is described in the [TPC-H specification|http://www.tpc.org/tpc_documents_current_versions/current_specifications.asp].
Suggestion: Describe the files available in the class path data source.
Along these same lines, in "Connect a Data Source", there is no mention of the class path data source. Yet, we reference that data source in the Web Console where we suggest a sample query to run:
{code}
Sample SQL query: SELECT * FROM cp.`employee.json` LIMIT 20
{code}
The above query refers to the FoodMart data set.

> Include internal data sets in documentation Sample Datasets
> ---
>
> Key: DRILL-6667
> URL: https://issues.apache.org/jira/browse/DRILL-6667
> Project: Apache Drill
> Issue Type: Improvement
> Components: Documentation
> Affects Versions: 1.13.0
> Reporter: Paul Rogers
> Assignee: Bridget Bevens
> Priority: Minor
>
> The Drill documentation provides the "Sample Datasets" section, which is very handy. However, this section does not discuss the two datasets provided with Drill itself.
> * Julian Hyde's [FoodMart data set|https://github.com/julianhyde/foodmart-data-hsqldb], available on the class path.
> * TPC-H data set, available on the class path in {{tpch}}
> The "FoodMart" data set is available directly under {{cp}}. In fact, the Drill sample query (see below) references a FoodMart table. To see the list of tables (at development time), find the {{foodmart-data-json-0.4.jar}} file in the Maven dependencies for {{drill-java-exec}}. The table names here are simplified relative to those in the ER diagram in the above link. Perhaps include a simple table with names, and the mapping to the original names, and a link to (or just embed) the FoodMart ER image. The data is available in JSON format.
> TPCH data is available in {{`cp`.`tpch/*.parquet`}}, in Parquet format. The schema is described in the [TPC-H specification|http://www.tpc.org/tpc_documents_current
[jira] [Updated] (DRILL-6667) Include internal data sets in documentation Sample Datasets
[ https://issues.apache.org/jira/browse/DRILL-6667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paul Rogers updated DRILL-6667:
---
Description:
The Drill documentation provides the "Sample Datasets" section, which is very handy. However, this section does not discuss the two datasets provided with Drill itself.
* Julian Hyde's [FoodMart data set|https://github.com/julianhyde/foodmart-data-hsqldb], available on the class path.
* TPC-H data set, available on the class path in {{tpch}}
The "FoodMart" data set is available directly under {{cp}}. In fact, the Drill sample query (see below) references a FoodMart table. To see the list of tables (at development time), find the {{foodmart-data-json-0.4.jar}} file in the Maven dependencies for {{drill-java-exec}}. The table names here are simplified relative to those in the ER diagram in the above link. Perhaps include a simple table with names, and the mapping to the original names, and a link to (or just embed) the FoodMart ER image. The data is available in JSON format.
TPCH data is available in {{`cp`.`tpch/*.parquet`}}, in Parquet format. The schema is described in the [TPC-H specification|http://www.tpc.org/tpc_documents_current_versions/current_specifications.asp].
Suggestion: Describe the files available in the class path data source.
Along these same lines, in "Connect a Data Source", there is no mention of the class path data source. Yet, we reference that data source in the Web Console where we suggest a sample query to run:
{code}
Sample SQL query: SELECT * FROM cp.`employee.json` LIMIT 20
{code}
The above query refers to the FoodMart data set.

was:
The Drill documentation provides the "Sample Datasets" section, which is very handy. However, this section does not discuss the two datasets provided with Drill itself.
* Julian Hyde's [FoodMart data set|https://github.com/julianhyde/foodmart-data-hsqldb], available on the class path.
* TPC-H data set, available on the class path in {{tpch}}
The "FoodMart" data set is available directly under {{cp}}. In fact, the Drill sample query (see below) references a FoodMart table. To see the list of tables (at development time), find the {{foodmart-data-json-0.4.jar}} file in the Maven dependencies for {{drill-java-exec}}. The table names here are simplified relative to those in the ER diagram in the above link. Perhaps include a simple table with names, and the mapping to the original names, and a link to (or just embed) the FoodMart ER image. The data is available in JSON format.
TPCH data is available in {{`cp`.`tpch/*.parquet`}}, in Parquet format. The schema is described in the [TPC-H specification|http://www.tpc.org/tpc_documents_current_versions/current_specifications.asp].
Further, in the "Tutorials" section, "Analyzing the Yelp Academic Dataset", we mention the Yelp data set. But, we don't mention that in the "Sample Datasets" section. We should, just to be consistent and to save the reader time when going back and saying, "Hey, didn't Drill provide some kind of Yelp data? Let me look in Sample Datasets. Wait.. no Yelp?"
These are very handy, but hard to find: I find I must keep searching the source code to remember file names and directory paths. End users won't have this luxury.
Suggestion: Describe the files available in the class path data source.
Along these same lines, in "Connect a Data Source", there is no mention of the class path data source. Yet, we reference that data source in the Web Console where we suggest a sample query to run:
{code}
Sample SQL query: SELECT * FROM cp.`employee.json` LIMIT 20
{code}
The above query refers to the FoodMart data set.

> Include internal data sets in documentation Sample Datasets
> ---
>
> Key: DRILL-6667
> URL: https://issues.apache.org/jira/browse/DRILL-6667
> Project: Apache Drill
> Issue Type: Improvement
> Components: Documentation
> Affects Versions: 1.13.0
> Reporter: Paul Rogers
> Assignee: Bridget Bevens
> Priority: Minor
>
> The Drill documentation provides the "Sample Datasets" section, which is very handy. However, this section does not discuss the two datasets provided with Drill itself.
> * Julian Hyde's [FoodMart data set|https://github.com/julianhyde/foodmart-data-hsqldb], available on the class path.
> * TPC-H data set, available on the class path in {{tpch}}
> The "FoodMart" data set is available directly under {{cp}}. In fact, the Drill sample query (see below) references a FoodMart table. To see the list of tables (at development time), find the {{foodmart-data-json-0.4.jar}} file in the Maven dependencies for {{drill-java-exec}}. The table names here are simplified relative to those in the ER diagram in the above link. Perha
[jira] [Updated] (DRILL-6667) Include internal data sets in documentation Sample Datasets
[ https://issues.apache.org/jira/browse/DRILL-6667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paul Rogers updated DRILL-6667:
---
Summary: Include internal data sets in documentation Sample Datasets (was: Include internal data sets in Documentation Sample Datasets)

> Include internal data sets in documentation Sample Datasets
> ---
>
> Key: DRILL-6667
> URL: https://issues.apache.org/jira/browse/DRILL-6667
> Project: Apache Drill
> Issue Type: Improvement
> Components: Documentation
> Affects Versions: 1.13.0
> Reporter: Paul Rogers
> Assignee: Bridget Bevens
> Priority: Minor
>
> The Drill documentation provides the "Sample Datasets" section, which is very handy. However, this section does not discuss the two datasets provided with Drill itself.
> * Julian Hyde's [FoodMart data set|https://github.com/julianhyde/foodmart-data-hsqldb], available on the class path.
> * TPC-H data set, available on the class path in {{tpch}}
> The "FoodMart" data set is available directly under {{cp}}. In fact, the Drill sample query (see below) references a FoodMart table. To see the list of tables (at development time), find the {{foodmart-data-json-0.4.jar}} file in the Maven dependencies for {{drill-java-exec}}. The table names here are simplified relative to those in the ER diagram in the above link. Perhaps include a simple table with names, and the mapping to the original names, and a link to (or just embed) the FoodMart ER image. The data is available in JSON format.
> TPCH data is available in {{`cp`.`tpch/*.parquet`}}, in Parquet format. The schema is described in the [TPC-H specification|http://www.tpc.org/tpc_documents_current_versions/current_specifications.asp].
> Further, in the "Tutorials" section, "Analyzing the Yelp Academic Dataset", we mention the Yelp data set. But, we don't mention that in the "Sample Datasets" section. We should, just to be consistent and to save the reader time when going back and saying, "Hey, didn't Drill provide some kind of Yelp data? Let me look in Sample Datasets. Wait.. no Yelp?"
> These are very handy, but hard to find: I find I must keep searching the source code to remember file names and directory paths. End users won't have this luxury.
> Suggestion: Describe the files available in the class path data source.
> Along these same lines, in "Connect a Data Source", there is no mention of the class path data source. Yet, we reference that data source in the Web Console where we suggest a sample query to run:
> {code}
> Sample SQL query: SELECT * FROM cp.`employee.json` LIMIT 20
> {code}
> The above query refers to the FoodMart data set.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
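The class path references discussed in the issue can be sketched in a few lines of Python. This is only an illustration, not part of Drill: the helper names (`cp_table`, `sample_query`, `rest_payload`) are hypothetical, and `tpch/nation.parquet` is an assumed file name taken from the TPC-H table list (the issue itself only gives the pattern `tpch/*.parquet`). The payload shape follows Drill's REST query endpoint (`POST /query.json` with `queryType` and `query` fields); nothing is sent over the network here.

```python
import json


def cp_table(path: str) -> str:
    """Build a table reference for a file served by the class path (cp) data source."""
    if "`" in path:
        # Backticks delimit the identifier, so refuse paths that would break quoting.
        raise ValueError("path must not contain a backtick")
    return f"cp.`{path}`"


def sample_query(path: str, limit: int = 20) -> str:
    """Mirror the Web Console sample query for any class path file."""
    return f"SELECT * FROM {cp_table(path)} LIMIT {int(limit)}"


def rest_payload(sql: str) -> str:
    """Serialize a query as the JSON body expected by Drill's REST query endpoint."""
    return json.dumps({"queryType": "SQL", "query": sql})


if __name__ == "__main__":
    # FoodMart table referenced by the Web Console sample query.
    print(sample_query("employee.json"))
    # Assumed TPC-H file name; adjust to whatever tpch/*.parquet actually contains.
    print(sample_query("tpch/nation.parquet", limit=5))
    print(rest_payload(sample_query("employee.json")))
```

The first call reproduces the sample query quoted in the issue; the second shows how the same pattern would extend to the Parquet files under `tpch`.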
[jira] [Updated] (DRILL-6667) Include internal data sets in Documentation Sample Datasets
[ https://issues.apache.org/jira/browse/DRILL-6667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paul Rogers updated DRILL-6667:
---
Description:
The Drill documentation provides the "Sample Datasets" section, which is very handy. However, this section does not discuss the two datasets provided with Drill itself.
* Julian Hyde's [FoodMart data set|https://github.com/julianhyde/foodmart-data-hsqldb], available on the class path.
* TPC-H data set, available on the class path in {{tpch}}
The "FoodMart" data set is available directly under {{cp}}. In fact, the Drill sample query (see below) references a FoodMart table. To see the list of tables (at development time), find the {{foodmart-data-json-0.4.jar}} file in the Maven dependencies for {{drill-java-exec}}. The table names here are simplified relative to those in the ER diagram in the above link. Perhaps include a simple table with names, and the mapping to the original names, and a link to (or just embed) the FoodMart ER image. The data is available in JSON format.
TPCH data is available in {{`cp`.`tpch/*.parquet`}}, in Parquet format. The schema is described in the [TPC-H specification|http://www.tpc.org/tpc_documents_current_versions/current_specifications.asp].
Further, in the "Tutorials" section, "Analyzing the Yelp Academic Dataset", we mention the Yelp data set. But, we don't mention that in the "Sample Datasets" section. We should, just to be consistent and to save the reader time when going back and saying, "Hey, didn't Drill provide some kind of Yelp data? Let me look in Sample Datasets. Wait.. no Yelp?"
These are very handy, but hard to find: I find I must keep searching the source code to remember file names and directory paths. End users won't have this luxury.
Suggestion: Describe the files available in the class path data source.
Along these same lines, in "Connect a Data Source", there is no mention of the class path data source. Yet, we reference that data source in the Web Console where we suggest a sample query to run:
{code}
Sample SQL query: SELECT * FROM cp.`employee.json` LIMIT 20
{code}
The above query refers to the FoodMart data set.

was:
The Drill documentation provides the "Sample Datasets" section, which is very handy. However, this section does not discuss the two datasets provided with Drill itself.
* Julian Hyde's [FoodMart data set|https://github.com/julianhyde/foodmart-data-hsqldb], available on the class path.
* TPC-H data set.
The "FoodMart" data set is available directly under {{cp}}. In fact, the Drill sample query (see below) references a FoodMart table. To see the list of tables (at development time), find the {{foodmart-data-json-0.4.jar}} file in the Maven dependencies for {{drill-java-exec}}. The table names here are simplified relative to those in the ER diagram in the above link. Perhaps include a simple table with names, and the mapping to the original names, and a link to (or just embed) the FoodMart ER image. The data is available in JSON format.
TPCH data is available in `cp`.`tpch/*.parquet`, in Parquet format. The schema is described in the [TPC-H specification|http://www.tpc.org/tpc_documents_current_versions/current_specifications.asp].
Further, in the "Tutorials" section, "Analyzing the Yelp Academic Dataset", we mention the Yelp data set. But, we don't mention that in the "Sample Datasets" section. We should, just to be consistent and to save the reader time when going back and saying, "Hey, didn't Drill provide some kind of Yelp data? Let me look in Sample Datasets. Wait.. no Yelp?"
These are very handy, but hard to find: I find I must keep searching the source code to remember file names and directory paths. End users won't have this luxury.
Suggestion: Describe the files available in the class path data source.
Along these same lines, in "Connect a Data Source", there is no mention of the class path data source. Yet, we reference that data source in the Web Console where we suggest a sample query to run:
{code}
Sample SQL query: SELECT * FROM cp.`employee.json` LIMIT 20
{code}
The above query refers to the FoodMart data set.

> Include internal data sets in Documentation Sample Datasets
> ---
>
> Key: DRILL-6667
> URL: https://issues.apache.org/jira/browse/DRILL-6667
> Project: Apache Drill
> Issue Type: Improvement
> Components: Documentation
> Affects Versions: 1.13.0
> Reporter: Paul Rogers
> Assignee: Bridget Bevens
> Priority: Minor
>
> The Drill documentation provides the "Sample Datasets" section, which is very handy. However, this section does not discuss the two datasets provided with Drill itself.
> * Julian Hyde's [FoodMart data set|https://github.com/julianhyde/foodmart-data-hsqldb]
[jira] [Updated] (DRILL-6667) Include internal data sets in Documentation Sample Datasets
[ https://issues.apache.org/jira/browse/DRILL-6667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paul Rogers updated DRILL-6667:
---
Description:
The Drill documentation provides the "Sample Datasets" section, which is very handy. However, this section does not discuss the two datasets provided with Drill itself.
* Julian Hyde's [FoodMart data set|https://github.com/julianhyde/foodmart-data-hsqldb], available on the class path.
* TPC-H data set.
The "FoodMart" data set is available directly under {{cp}}. In fact, the Drill sample query (see below) references a FoodMart table. To see the list of tables (at development time), find the {{foodmart-data-json-0.4.jar}} file in the Maven dependencies for {{drill-java-exec}}. The table names here are simplified relative to those in the ER diagram in the above link. Perhaps include a simple table with names, and the mapping to the original names, and a link to (or just embed) the FoodMart ER image. The data is available in JSON format.
TPCH data is available in `cp`.`tpch/*.parquet`, in Parquet format. The schema is described in the [TPC-H specification|(http://www.tpc.org/tpc_documents_current_versions/current_specifications.asp].
Further, in the "Tutorials" section, "Analyzing the Yelp Academic Dataset", we mention the Yelp data set. But, we don't mention that in the "Sample Datasets" section. We should, just to be consistent and to save the reader time when going back and saying, "Hey, didn't Drill provide some kind of Yelp data? Let me look in Sample Datasets. Wait.. no Yelp?"
These are very handy, but hard to find: I find I must keep searching the source code to remember file names and directory paths. End users won't have this luxury.
Suggestion: Describe the files available in the class path data source.
Along these same lines, in "Connect a Data Source", there is no mention of the class path data source. Yet, we reference that data source in the Web Console where we suggest a sample query to run:
{code}
Sample SQL query: SELECT * FROM cp.`employee.json` LIMIT 20
{code}
The above query refers to the FoodMart data set.

was:
The Drill documentation provides the "Sample Datasets" section, which is very handy. However, this section does not discuss the two datasets provided with Drill itself.
* Julian Hyde's [FoodMart data set|https://github.com/julianhyde/foodmart-data-hsqldb], available on the class path.
* TPC-H data set.
The "FoodMart" data set is available directly under {{cp}}. In fact, the Drill sample query (see below) references a FoodMart table. To see the list of tables (at development time), find the {{foodmart-data-json-0.4.jar}} file in the Maven dependencies for {{drill-java-exec}}. The table names here are simplified relative to those in the ER diagram in the above link. Perhaps include a simple table with names, and the mapping to the original names, and a link to (or just embed) the FoodMart ER image. The data is available in JSON format.
TPCH data is available in `cp`.`tpch/*.parquet`, in Parquet format. The schema is described in the [TPC-H specification|(http://www.tpc.org/tpc_documents_current_versions/current_specifications.asp].
Further, in the "Tutorials" section, "Analyzing the Yelp Academic Dataset", we mention the Yelp data set. But, we don't mention that in the "Sample Datasets" section. We should, just to be consistent and to save the reader time when going back and saying, "Hey, didn't Drill provide some kind of Yelp data? Let me look in Sample Datasets. Wait.. no Yelp?"
These are very handy, but hard to find: I find I must keep searching the source code to remember file names and directory paths. End users won't have this luxury.
Suggestion: Describe the files available in the class path data source.
Along these same lines, in "Connect a Data Source", there is no mention of the class path data source. Yet, we reference that data source in the Web Console where we suggest a sample query to run:
{code}
Sample SQL query: SELECT * FROM cp.`employee.json` LIMIT 20
{code}

> Include internal data sets in Documentation Sample Datasets
> ---
>
> Key: DRILL-6667
> URL: https://issues.apache.org/jira/browse/DRILL-6667
> Project: Apache Drill
> Issue Type: Improvement
> Components: Documentation
> Affects Versions: 1.13.0
> Reporter: Paul Rogers
> Assignee: Bridget Bevens
> Priority: Minor
>
> The Drill documentation provides the "Sample Datasets" section, which is very handy. However, this section does not discuss the two datasets provided with Drill itself.
> * Julian Hyde's [FoodMart data set|https://github.com/julianhyde/foodmart-data-hsqldb], available on the class path.
> * TPC-H data set.
> The "FoodMart" data set is available
[jira] [Updated] (DRILL-6667) Include internal data sets in Documentation Sample Datasets
[ https://issues.apache.org/jira/browse/DRILL-6667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paul Rogers updated DRILL-6667:
---
Description:
The Drill documentation provides the "Sample Datasets" section, which is very handy. However, this section does not discuss the two datasets provided with Drill itself.
* Julian Hyde's [FoodMart data set|https://github.com/julianhyde/foodmart-data-hsqldb], available on the class path.
* TPC-H data set.
The "FoodMart" data set is available directly under {{cp}}. In fact, the Drill sample query (see below) references a FoodMart table. To see the list of tables (at development time), find the {{foodmart-data-json-0.4.jar}} file in the Maven dependencies for {{drill-java-exec}}. The table names here are simplified relative to those in the ER diagram in the above link. Perhaps include a simple table with names, and the mapping to the original names, and a link to (or just embed) the FoodMart ER image. The data is available in JSON format.
TPCH data is available in `cp`.`tpch/*.parquet`, in Parquet format. The schema is described in the [TPC-H specification|http://www.tpc.org/tpc_documents_current_versions/current_specifications.asp].
Further, in the "Tutorials" section, "Analyzing the Yelp Academic Dataset", we mention the Yelp data set. But, we don't mention that in the "Sample Datasets" section. We should, just to be consistent and to save the reader time when going back and saying, "Hey, didn't Drill provide some kind of Yelp data? Let me look in Sample Datasets. Wait.. no Yelp?"
These are very handy, but hard to find: I find I must keep searching the source code to remember file names and directory paths. End users won't have this luxury.
Suggestion: Describe the files available in the class path data source.
Along these same lines, in "Connect a Data Source", there is no mention of the class path data source. Yet, we reference that data source in the Web Console where we suggest a sample query to run:
{code}
Sample SQL query: SELECT * FROM cp.`employee.json` LIMIT 20
{code}
The above query refers to the FoodMart data set.

was:
The Drill documentation provides the "Sample Datasets" section, which is very handy. However, this section does not discuss the two datasets provided with Drill itself.
* Julian Hyde's [FoodMart data set|https://github.com/julianhyde/foodmart-data-hsqldb], available on the class path.
* TPC-H data set.
The "FoodMart" data set is available directly under {{cp}}. In fact, the Drill sample query (see below) references a FoodMart table. To see the list of tables (at development time), find the {{foodmart-data-json-0.4.jar}} file in the Maven dependencies for {{drill-java-exec}}. The table names here are simplified relative to those in the ER diagram in the above link. Perhaps include a simple table with names, and the mapping to the original names, and a link to (or just embed) the FoodMart ER image. The data is available in JSON format.
TPCH data is available in `cp`.`tpch/*.parquet`, in Parquet format. The schema is described in the [TPC-H specification|(http://www.tpc.org/tpc_documents_current_versions/current_specifications.asp].
Further, in the "Tutorials" section, "Analyzing the Yelp Academic Dataset", we mention the Yelp data set. But, we don't mention that in the "Sample Datasets" section. We should, just to be consistent and to save the reader time when going back and saying, "Hey, didn't Drill provide some kind of Yelp data? Let me look in Sample Datasets. Wait.. no Yelp?"
These are very handy, but hard to find: I find I must keep searching the source code to remember file names and directory paths. End users won't have this luxury.
Suggestion: Describe the files available in the class path data source.
Along these same lines, in "Connect a Data Source", there is no mention of the class path data source. Yet, we reference that data source in the Web Console where we suggest a sample query to run:
{code}
Sample SQL query: SELECT * FROM cp.`employee.json` LIMIT 20
{code}
The above query refers to the FoodMart data set.

> Include internal data sets in Documentation Sample Datasets
> ---
>
> Key: DRILL-6667
> URL: https://issues.apache.org/jira/browse/DRILL-6667
> Project: Apache Drill
> Issue Type: Improvement
> Components: Documentation
> Affects Versions: 1.13.0
> Reporter: Paul Rogers
> Assignee: Bridget Bevens
> Priority: Minor
>
> The Drill documentation provides the "Sample Datasets" section, which is very handy. However, this section does not discuss the two datasets provided with Drill itself.
> * Julian Hyde's [FoodMart data set|https://github.com/julianhyde/foodmart-data-hsqldb], available on the class path.
> * TPC-H
[jira] [Updated] (DRILL-6667) Include internal data sets in Documentation Sample Datasets
[ https://issues.apache.org/jira/browse/DRILL-6667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paul Rogers updated DRILL-6667:
---
Description:
The Drill documentation provides the "Sample Datasets" section, which is very handy. However, this section does not discuss the two datasets provided with Drill itself.
* Julian Hyde's [FoodMart data set|https://github.com/julianhyde/foodmart-data-hsqldb], available on the class path.
* TPC-H data set.
The "FoodMart" data set is available directly under {{cp}}. In fact, the Drill sample query (see below) references a FoodMart table. To see the list of tables (at development time), find the {{foodmart-data-json-0.4.jar}} file in the Maven dependencies for {{drill-java-exec}}. The table names here are simplified relative to those in the ER diagram in the above link. Perhaps include a simple table with names, and the mapping to the original names, and a link to (or just embed) the FoodMart ER image. The data is available in JSON format.
TPCH data is available in `cp`.`tpch/*.parquet`, in Parquet format. The schema is described in the [TPC-H specification|(http://www.tpc.org/tpc_documents_current_versions/current_specifications.asp].
Further, in the "Tutorials" section, "Analyzing the Yelp Academic Dataset", we mention the Yelp data set. But, we don't mention that in the "Sample Datasets" section. We should, just to be consistent and to save the reader time when going back and saying, "Hey, didn't Drill provide some kind of Yelp data? Let me look in Sample Datasets. Wait.. no Yelp?"
These are very handy, but hard to find: I find I must keep searching the source code to remember file names and directory paths. End users won't have this luxury.
Suggestion: Describe the files available in the class path data source.
Along these same lines, in "Connect a Data Source", there is no mention of the class path data source. Yet, we reference that data source in the Web Console where we suggest a sample query to run:
{code}
Sample SQL query: SELECT * FROM cp.`employee.json` LIMIT 20
{code}

was:
The Drill documentation provides the "Sample Datasets" section, which is very handy. However, this section does not discuss the two datasets provided with Drill itself.
* Julian Hyde's [FoodMart data set|https://github.com/julianhyde/foodmart-data-hsqldb], available on the class path.
* TPC-H data set.
The "FoodMart" data set is available directly under {{cp}}. In fact, the Drill sample query (see below) references a FoodMart table. To see the list of tables (at development time), find the {{foodmart-data-json-0.4.jar}} file in the Maven dependencies for {{drill-java-exec}}. The table names here are simplified relative to those in the ER diagram in the above link. Perhaps include a simple table with names, and the mapping to the original names, and a link to (or just embed) the FoodMart ER image. The data is available in JSON format.
TPCH data is available in `cp`.`tpch/*.parquet`, in Parquet format. The schema is described in the [TPC-H specification](http://www.tpc.org/tpc_documents_current_versions/current_specifications.asp).
Further, in the "Tutorials" section, "Analyzing the Yelp Academic Dataset", we mention the Yelp data set. But, we don't mention that in the "Sample Datasets" section. We should, just to be consistent and to save the reader time when going back and saying, "Hey, didn't Drill provide some kind of Yelp data? Let me look in Sample Datasets. Wait.. no Yelp?"
These are very handy, but hard to find: I find I must keep searching the source code to remember file names and directory paths. End users won't have this luxury.
Suggestion: Describe the files available in the class path data source.
Along these same lines, in "Connect a Data Source", there is no mention of the class path data source. Yet, we reference that data source in the Web Console where we suggest a sample query to run:
{code}
Sample SQL query: SELECT * FROM cp.`employee.json` LIMIT 20
{code}

> Include internal data sets in Documentation Sample Datasets
> ---
>
> Key: DRILL-6667
> URL: https://issues.apache.org/jira/browse/DRILL-6667
> Project: Apache Drill
> Issue Type: Improvement
> Components: Documentation
> Affects Versions: 1.13.0
> Reporter: Paul Rogers
> Assignee: Bridget Bevens
> Priority: Minor
>
> The Drill documentation provides the "Sample Datasets" section, which is very handy. However, this section does not discuss the two datasets provided with Drill itself.
> * Julian Hyde's [FoodMart data set|https://github.com/julianhyde/foodmart-data-hsqldb], available on the class path.
> * TPC-H data set.
> The "FoodMart" data set is available directly under {{cp}}. In fact, the Drill samp