Re: [sqlite] Inserting Values in Bulk

Pablo Santacruz Wed, 13 Sep 2006 08:28:55 -0700

Try this, it may suits you.
(I'm using pysqlite from trac)


from pysqlite2 import dbapi2 as sqlite
import csv

con = sqlite.connect("mydb")
cursor = con.cursor()
reader = csv.reader(open("some.csv", "rb"))
qmarks = "?," * 30 + "?"

#1
for row in reader: #row is a tuple
   cursor.execute("insert into pr values (" + qmarks + )", row)

#Alternative to #1 (instead of iterating)
#cursor.executemany("insert into pr values (" + qmarks + ")", reader)




On 9/12/06, Dennis Cote <[EMAIL PROTECTED]> wrote:

Rich Shepard wrote:
>
>   Yes, 'row' is the string of values.
>
>> Note, you have only got 4 columns in your column list, so you will
>> get an
>> error if you feed in 31 columns of data.
>
>   I've corrected this already.
>
>> You may need to extract the relevant columns using split to separate
>> your
>> string at the commas, collecting the data for the columns you need, and
>> then reassembling only those columns into a new string of row data.
>
>   This is what I thought that I did for the first three fields (a
> sequential
> record ID that relates the digital record to a paper form), and the
> two text
> fields. Those are inserted in individual statements. I wasn't at all
> clear
> about the rest of the values; that's why I asked.
>
>   The statement now reads:
>
> # now get the pair-wise vote values for rows 7-34, bytes 12-67, with
> dictionary lookups
> pw = split_line[7:67:2]
> self.cur.execute("insert into voting(pr1, pr2, pr3, pr4, \
>     pr5, pr6, pr7, pr8, pr9, pr10, pr11, pr12, pr13, pr14, pr15, \
>     pr16, pr17, pr18, pr19, pr20, pr21, pr22, pr23, pr24, pr25, \
>     pr26, pr27, pr28) values (DATA_MAP_7(pw))")
>
>   If I understand your revision, I need to change the end of the above
> statement to values (DATA_MAP_7(" + pw + "). Does this look better?
>
>
Rich,

I'm still not sure about exactly what you are trying to do, but what I
was suggesting was building your SQL insert statement as a string in
python by concatenating the constant part of the statement (the insert
keyword, the table name , and the list of columns) with the variable
part of the statement (the data for each row in your table). The +
operator is used to concatenate strings into larger strings in python.
Then passing your complete SQL statement to the execute method.

This was a suggestion of a quick and dirty way to get your data into a
table since you said you already had the data for the row in a string
variable, with columns separated by commas, and text fields delimited by
quotes.

If that is not the case, there are better ways to put data into a table.

First, it is generally frown on to have repeated data in columns of a
relational database. These columns should be normalized into another
table and related to the associated record in the existing table using
its  record id  (hence the name relational database). Instead of a table
like this:

create table votes (
    id integer primary key,
    cat text,
    pos text,
    pr1 float,
    pr2 float,
    pr3 float,
    ...
    pr28 float
)

You should normalize the floats into a separate table. This table will
have two columns if the order of the prn fields does not matter, or
three if it does.

create table pr(
    vote_id integer references votes(id),
    pr float
)

or

create table pr(
    vote_id integer references votes(id),
    n integer,
    pr float
)

And your votes table will have only the first three columns.

create table votes (
    id integer primary key,
    cat text,
    pos text,
)

Now when you insert a record into the votes table, you will also insert
28 records into the pr table. Each of these 28 records will have the
same value for the vote_id, the value of the id just inserted into the
votes table. If the order of the pr fields is important then you need to
track the order in the pr table by setting the column n to match the pr
field number as each pr value is inserted.

The following sample shows how to do this for a single CSV data row in a
string variable row.

def dequote(s):
    if s.startswith('"') and s.endswith('"'):
       s = s[1:-1]
       s = s.replace('""', '"')
    return s

# separate fields and dequote strings from CSV row
fields = row.split(',')
fields = map(dequote, fields)
# get id of this row
id = int(fields[0])
# insert the row into the main table
cur.execute("insert into votes values(?, ?, ?)", (id, fields[1],
fields[2]))
# insert the repeated pr fields into the associated pr table
for n in range(28):
    cur.execute("insert into pr values(?, ?, ?)", (id, n+1,
float(fields[n+3]))

Even if you don't normalize your table this sample should show how to
use parameters (the question marks in the SQL) to make your SQL clearer
and safer (and due to statement caching in pysqlite it should also
perform faster).

HTH
Dennis Cote

-----------------------------------------------------------------------------
To unsubscribe, send email to [EMAIL PROTECTED]

-----------------------------------------------------------------------------



--
Pablo

Re: [sqlite] Inserting Values in Bulk

Reply via email to