[ https://issues.apache.org/jira/browse/SOLR-6127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Varun Thacker updated SOLR-6127:
--------------------------------

    Attachment: freebase_film_dump.py

I thought Freebase would be a good place to get data from. 

[~thetaphi] - Would using the data from Freebase ( 
https://developers.google.com/freebase/faq#rules_for_using_data ) raise a 
licensing issue?

If that's not a concern, here is a script which fetches 200 rows of film data ( 
http://www.freebase.com/film ) and dumps them into JSON, XML and CSV.

The number of documents can be adjusted. You will need to fill in your own API 
key for the script to run.
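
For anyone who wants to see the shape of the output without running the attachment, here is a rough sketch of just the dump step. It is not the attached script: it makes no Freebase call, and the sample rows, the field names (id, name, directed_by) and the helper names are all made up for illustration. It only shows the same documents written once per format.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Hypothetical sample rows; the field names are assumptions, not the
# fields the attached script actually fetches.
FILMS = [
    {"id": "film-1", "name": "Example Film One", "directed_by": "Director A"},
    {"id": "film-2", "name": "Example Film Two", "directed_by": "Director B"},
]
FIELDS = ["id", "name", "directed_by"]


def to_json(docs):
    """Serialize docs as a JSON array of flat objects."""
    return json.dumps(docs, indent=2)


def to_xml(docs):
    """Serialize docs as Solr update XML: <add><doc><field name="...">."""
    add = ET.Element("add")
    for doc in docs:
        doc_el = ET.SubElement(add, "doc")
        for key in FIELDS:
            field = ET.SubElement(doc_el, "field", name=key)
            field.text = doc[key]
    return ET.tostring(add, encoding="unicode")


def to_csv(docs):
    """Serialize docs as CSV with a header row of field names."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(docs)
    return buf.getvalue()
```

Generating all three files from one list of rows is what would keep the formats in step with each other.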

Any opinions on whether this is a good idea?

> Improve Solr's exampledocs data
> -------------------------------
>
>                 Key: SOLR-6127
>                 URL: https://issues.apache.org/jira/browse/SOLR-6127
>             Project: Solr
>          Issue Type: Improvement
>          Components: documentation
>            Reporter: Varun Thacker
>            Priority: Minor
>             Fix For: 5.0
>
>         Attachments: freebase_film_dump.py
>
>
> Currently 
> - The CSV example has 10 documents.
> - The JSON example has 4 documents.
> - The XML example has 32 documents.
> 1. We should have an equal number of documents, and the same documents, in all 
> the example formats.
> 2. We should have a data set which is slightly more comprehensive.
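
Point 1 of the description could be checked mechanically. The sketch below is only an illustration under assumptions: it takes the three example files as already-read strings, assumes every document has an "id" field, and assumes the JSON file is a flat array of objects and the XML file uses the <add><doc><field name="..."> layout. The function names are hypothetical.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET


def ids_from_json(text):
    """Collect ids from a JSON array of document objects."""
    return [doc["id"] for doc in json.loads(text)]


def ids_from_xml(text):
    """Collect ids from Solr update XML (<add><doc><field name="id">)."""
    root = ET.fromstring(text)
    return [doc.find("field[@name='id']").text for doc in root.iter("doc")]


def ids_from_csv(text):
    """Collect ids from CSV whose header row contains an 'id' column."""
    return [row["id"] for row in csv.DictReader(io.StringIO(text))]


def same_documents(json_text, xml_text, csv_text):
    """True when all three formats hold the same ids, the same number of times."""
    j = sorted(ids_from_json(json_text))
    x = sorted(ids_from_xml(xml_text))
    c = sorted(ids_from_csv(csv_text))
    return j == x == c
```

Comparing sorted id lists rather than bare counts catches the case where the formats have the same number of documents but different ones.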



