Paul Rogers created DRILL-4709:
----------------------------------

             Summary: Document the included Foodmart sample data
                 Key: DRILL-4709
                 URL: https://issues.apache.org/jira/browse/DRILL-4709
             Project: Apache Drill
          Issue Type: Improvement
          Components: Documentation
    Affects Versions: 1.6.0
            Reporter: Paul Rogers
            Priority: Minor

Drill includes a JSON version of the Mondrian FoodMart sample data. The data ships in the $DRILL_HOME/jars/3rdparty/foodmart-data-json-0.4.jar jar file and is accessible through the class path (cp) storage plugin.

The documentation mentions using the cp plugin to access customers.json. However, the FoodMart data set is quite rich, with many example files. As it stands, only a curious developer who is good with Google will find the other data sets or the source of the FoodMart data.

The data appears to be a JSON version of the SQL sample data for the Mondrian project. A schema description is here:

https://github.com/pentaho/mondrian/blob/master/demo/FoodMart.xml

The FoodMart data appears to have originated at Microsoft as a sample database for its circa-2000 OLAP products, and Microsoft has since discontinued it. See

* http://sqlmag.com/development/dts-2000-action
* https://technet.microsoft.com/en-us/library/aa217032(v=sql.80).aspx
* http://sqlmag.com/sql-server/desperately-seeking-samples

Or do a Google search for "microsoft foodmart database".

The request is to:

1. Credit MS and Mondrian for the data.
2. Either explain the data (which is quite a bit of work), or
3. Explain how to extract the files from the jar file to explore manually.
4. Provide a pointer to a description of the schema (if such can be found.)

For option 3:

    cd $DRILL_HOME/jars/3rdparty
    unzip foodmart-data-json-0.4.jar -d ~/foodmart
    cd ~/foodmart
    ls

Looking at the data, it is clear that some description is needed to understand the many tables and how they might work with Drill.
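
As a complement to option 3, here is a minimal sketch that lists the JSON files in the jar without extracting them and prints a starter query for each one through the cp plugin. It rests on several assumptions: that Info-ZIP's unzip is installed (its -Z1 zipinfo mode prints one entry name per line), and that each entry name in the jar is also its path under the cp plugin. The helper name to_drill_queries and the LIMIT 5 are illustrative only and are not part of Drill.

```shell
#!/bin/sh
# Hypothetical helper: reads file names on stdin, one per line, and prints
# a Drill query that samples each file through the classpath (cp) plugin.
to_drill_queries() {
  while IFS= read -r name; do
    printf 'SELECT * FROM cp.`%s` LIMIT 5;\n' "$name"
  done
}

# Jar location as given in the issue; DRILL_HOME must point at the Drill install.
JAR="${DRILL_HOME:-}/jars/3rdparty/foodmart-data-json-0.4.jar"

# Only touch the jar if it is actually there.
if [ -f "$JAR" ]; then
  # -Z1 lists entry names only; the pattern keeps just the JSON data files.
  unzip -Z1 "$JAR" '*.json' | to_drill_queries
fi
```

Each printed statement can be pasted into a Drill shell session to inspect a few rows of the corresponding file before deciding which tables are worth documenting.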