Paul Rogers created DRILL-4709:
----------------------------------

             Summary: Document the included Foodmart sample data
                 Key: DRILL-4709
                 URL: https://issues.apache.org/jira/browse/DRILL-4709
             Project: Apache Drill
          Issue Type: Improvement
          Components: Documentation
    Affects Versions: 1.6.0
            Reporter: Paul Rogers
            Priority: Minor


Drill includes a JSON version of the Mondrian FoodMart sample data. This data 
is packaged in the $DRILL_HOME/jars/3rdparty/foodmart-data-json-0.4.jar jar 
file and is accessible through the classpath (cp) storage plugin.

The documentation mentions using the cp plugin to access customers.json. 
However, the FoodMart data set is quite rich, with many example files.

As it stands, only a curious developer who is good with Google will be able 
to find the other data sets or the source of the FoodMart data.

The data appears to be a JSON version of the SQL sample data for the Mondrian 
project. A schema description is here: 
https://github.com/pentaho/mondrian/blob/master/demo/FoodMart.xml

The Mondrian data appears to have originated at Microsoft, as a sample to 
highlight its circa-2000 OLAP products, but has since been discontinued. See:

* http://sqlmag.com/development/dts-2000-action
* https://technet.microsoft.com/en-us/library/aa217032(v=sql.80).aspx
* http://sqlmag.com/sql-server/desperately-seeking-samples

Or do a Google search for "microsoft foodmart database".

The request is to:

1. Credit Microsoft and Mondrian for the data.
2. Either explain the data (which is quite a bit of work), or
3. Explain how to extract the files from the jar file to explore manually.
4. Provide a pointer to a description of the schema (if one can be found).

For option 3:

cd $DRILL_HOME/jars/3rdparty
unzip foodmart-data-json-0.4.jar -d ~/foodmart
cd ~/foodmart
ls
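
As a possible alternative to extracting everything, the file names can be 
listed straight from the jar. The following is only a sketch: the function 
name list_foodmart_files is made up for illustration, and it assumes the 
Info-ZIP unzip command is installed and that the jar sits at the path given 
above.

```shell
# Hypothetical helper (not part of Drill): print the names of the JSON files
# inside the FoodMart jar without extracting them.
# Assumes Info-ZIP `unzip`; the default path is the one quoted in this issue.
list_foodmart_files() {
  jar="${1:-$DRILL_HOME/jars/3rdparty/foodmart-data-json-0.4.jar}"
  if [ ! -f "$jar" ]; then
    echo "jar not found: $jar" >&2
    return 1
  fi
  # -Z1 switches unzip to zipinfo mode and prints one entry name per line.
  unzip -Z1 "$jar" | grep '\.json$'
}
```

Calling list_foodmart_files with no argument uses the default jar location; 
a different jar path can be passed as the first argument.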

Looking at the data, it is clear that SOME description is needed to understand 
the many tables and how they might work with Drill.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
