John,

On 11/19/2013 04:28 AM, John Norris wrote:
> I am trying to retrofit an existing web app using SQL to couchdb with
> ektorp. I have setup couchdb and run through some tutorials (ektorp,
> seven databases in seven weeks, definitive guide).
> If I have several objects (represented as pojos) then in SQL this
> would probably equate to several tables within the database.
> But in couchdb, a database is a number of documents? And those
> documents can represent several object types? So do I need each
> document to have a field representing what type it is? (eg a field
> that is unique to that document type).

So far as I can understand your question, it depends on whether each pojo
object contains many rows of similar data. If they don't, represent each
object as one doc, like this:

{ "_id": "12345", "type": "pojo" }
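A field like that is also what you would typically build a view on to pull
all the documents of one type back out. Here's a rough sketch of what that
looks like against CouchDB's raw HTTP API, in Python with the requests
library (the database name, documents and view are made up purely for
illustration; ektorp wraps these same HTTP calls):

----------------
#!/usr/bin/env python
# Sketch only: several object types in one database, each document carrying
# a "type" field, plus a view to query one type at a time. Assumes CouchDB
# on localhost:5984 and the "requests" library; all names are placeholders.
import json
import requests

DB = 'http://localhost:5984/example'
HEADERS = {'Content-Type': 'application/json'}

# Create the database (CouchDB returns an error if it already exists)
requests.put(DB)

# One document per object, each with a "type" field and a namespaced _id
requests.put(DB + '/pojo.1',
             data=json.dumps({'type': 'pojo', 'name': 'first'}),
             headers=HEADERS)
requests.put(DB + '/widget.1',
             data=json.dumps({'type': 'widget', 'size': 3}),
             headers=HEADERS)

# A design document with a view keyed on the "type" field
design = {'views': {'by_type': {
    'map': 'function(doc) { if (doc.type) { emit(doc.type, null); } }'}}}
requests.put(DB + '/_design/app', data=json.dumps(design), headers=HEADERS)

# Query only the "pojo" documents
resp = requests.get(DB + '/_design/app/_view/by_type',
                    params={'key': json.dumps('pojo'),
                            'include_docs': 'true'})
for row in resp.json()['rows']:
    print(row['doc'])
----------------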
> Or does each document type go in its own database?

If each pojo object contains many rows of similar data, I'd probably break
it up into one document per row and keep all the pojos in the same
database, so I could query across all pojos. I don't think it's possible
to query across multiple databases in CouchDB.

Here are two files I use to migrate data from sqlite to CouchDB, which I
offer here as an example for anyone doing similar work:

csv2json.py:
----------------
#!/usr/bin/env python
import csv, sys, json

# Open the CSV file passed as the first command line argument
f = open(sys.argv[1], 'r')
reader = csv.DictReader(f)

rows = []
for row in reader:
    # Insert document collection column, which equals the sqlite table name
    row['collection'] = sys.argv[2]
    # Copy the keys so the dict can be modified while looping (needed on Python 3)
    for key in list(row.keys()):
        # Remove underscore from beginning of attribute names
        if key.startswith('_'):
            new_key = key.lstrip('_')
            row[new_key] = row[key]
            del row[key]
        # Convert id column to a namespaced _id to avoid conflicts
        if key == 'id':
            row['_id'] = sys.argv[2] + '.' + row['id']
            del row['id']
        if key == 'user_id':
            row['_id'] = sys.argv[2] + '.' + row['user_id']
            del row['user_id']
        if key == 'type':
            row['job'] = row['type']
            del row['type']
    rows.append(row)

# Wrap in CouchDB _bulk_docs JSON format
out = '{"docs":%s}' % json.dumps(rows)
print(out)
-----------------

sqlite2csv2couchdb.sh
------------------
#!/bin/bash

# Get the database from the production site
scp remote_host:path/to/sqlite.db .

DB="http://username:password@localhost:5984/projects"

# TODO: Use filtered replication to save the design docs

# Delete old copy of database
curl -X DELETE $DB
# Wait a second to let CouchDB delete the old database.
sleep 1
# Create new copy of database
curl -X PUT $DB
# TODO: Set permissions on couchdb database

# Create list of tables
tables=`sqlite3 sqlite.db 'SELECT tbl_name FROM sqlite_master WHERE type="table"'`

while read -r line; do
    # Filter out the visits tables
    if [ "$line" != "visit" ] && [ "$line" != "visit_identity" ]
    then
        # Get table of data as CSV
        rows=$(sqlite3 -csv -header sqlite.db "SELECT * FROM $line")
        echo "$rows" > tmp.csv
        rows=$(python csv2json.py tmp.csv "$line")
        # Write JSON to a file to avoid a curl error about too many command line arguments
        echo "$rows" > tmp.json
        # Insert table into couchdb
        curl -d @tmp.json -H "Content-Type:application/json" -X POST $DB/_bulk_docs &> /dev/null
    fi
done <<< "$tables"

rm tmp.json
rm tmp.csv
rm sqlite.db
---------------------

Tim
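P.S. The shell script above discards curl's output, but _bulk_docs reports
success or failure for every document, so it can be worth looking at the
response. Here's a rough sketch of the same POST in Python, assuming the
requests library and the same tmp.json and database URL used in
sqlite2csv2couchdb.sh:

----------------
#!/usr/bin/env python
# Sketch only: POST the payload produced by csv2json.py to _bulk_docs and
# report any documents that were rejected.
import requests

DB = 'http://username:password@localhost:5984/projects'

with open('tmp.json') as f:
    payload = f.read()

resp = requests.post(DB + '/_bulk_docs', data=payload,
                     headers={'Content-Type': 'application/json'})

# _bulk_docs returns one status object per document
for result in resp.json():
    if 'error' in result:
        print('failed: %s (%s)' % (result.get('id'), result['error']))
----------------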