On Sunday, 3 December, 2017 08:24, Richard Rousselot <richard.rousse...@gmail.com> wrote:
> Had similar issue a few years ago; we were using the SQLite3.exe. We
> recompiled the exe as 64 bit, which allowed it to use more than 4 GB of
> RAM, and loaded the machine with as much memory as we could. Voila,
> super fast processing.
>
> Can the Python libraries be made 64 bit somehow?

Yes. You must be using a 64-bit version of Python. The procedure for replacing the sqlite3.dll / sqlite3.so it uses is the same as for the 32-bit version, and so is compiling and using a 64-bit build of the apsw extension. Neither Windows nor Linux can thunk a dynamically loaded module so that it runs in a different model than the hosting process (that was proprietary IBM technology that no one else seems smart enough to duplicate), so you have to move Python to the 64-bit model as well.

On Windows 10 16299.98 I have both a 32-bit Python (2.7.14) and a 64-bit Python (3.6.4) installed, and I build 32-bit DLLs for the former and 64-bit DLLs for the latter from the same source (just selecting -m32 or -m64 as appropriate). I use the MinGW64/GCC compiler because (a) it can compile to either model depending on the switch, without requiring any code changes; (b) it supports long long and long double in 32-bit builds; and (c) it does not require the "moving target" Microsoft C Runtime libraries -- it can compile against the subsystem runtime (MSVCRT), which has been stable since, oh, the first OS/2 New Technology (which later became Windows NT) way back when. Oh, and MinGW/GCC does true position-independent code: when you statically link a module into either an executable or a dynamic load library, it is truly static, with no strange external dependencies.

Since SQLite3 is heavily I/O bound (or at least syscall/kernel-call bound, for mutexes and the like) in practically everything it does, the 64-bit version is much faster (about 20%) than the 32-bit version when running on a 64-bit OS, since the OS does not have to thunk the call stack when entering and returning from the kernel.
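A quick way to confirm which model your interpreter (and hence any extension it loads) is running -- a minimal sketch using only the standard library; the pointer size of the running process tells you whether you are in a 32-bit or 64-bit build:

```python
import struct
import platform
import sqlite3

# Pointer size of the running interpreter: 4 bytes -> 32-bit, 8 bytes -> 64-bit.
# Any extension module (sqlite3.dll / sqlite3.so, apsw) must match this model.
bits = struct.calcsize("P") * 8
print("Python model:", bits, "bit")
print("Interpreter:", platform.python_version())
print("Bundled SQLite library:", sqlite3.sqlite_version)
```

If this prints 32 on a 64-bit OS, dropping in a 64-bit sqlite3 library will fail to load; the interpreter itself has to be replaced first.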
> On Thu, Nov 30, 2017 at 7:01 PM Keith Medcalf <kmedc...@dessus.com> wrote:
>
>> Is there an index on pos where ppos is the left-most field (or the
>> only field) in the index? What is the column affinity of ppos? Of the
>> field you are passing as a parameter? Is ppos unique?
>>
>> If you CREATE [UNIQUE] INDEX searchindex ON pos (ppos, pnum); then
>> your query can be satisfied using only the searchindex covering index.
>>
>> If there is not an index on ppos, then you will be wasting time
>> recreating the index for each query.
>>
>> You will probably need to increase the cache size beyond the paltry
>> default in order for the entire btree structures to be cached in RAM --
>> you probably want to make it as big as you can.
>>
>> ---
>> The fact that there's a Highway to Hell but only a Stairway to Heaven
>> says a lot about anticipated traffic volume.
>>
>> >-----Original Message-----
>> >From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org]
>> >On Behalf Of Kevin O'Gorman
>> >Sent: Saturday, 25 November, 2017 20:14
>> >To: sqlite-users
>> >Subject: [sqlite] Simple read-only program very slow
>> >
>> >I'm pretty new at SQLite, so this may seem obvious to you. Be kind.
>> >I'm using Python on Ubuntu Linux 16.04 LTS, and the sqlite that is
>> >built into Python. The database is using WAL.
>> >
>> >I've got a database of some 100 million records, and a file of just
>> >over 300 thousand that I want represented in it. I wanted to check
>> >how much difference it was going to make, so I wrote a super simple
>> >program to read the file and count how many records are already
>> >there. I got impatient waiting for it, so I killed the process and
>> >added an output of one dot (".") per 1000 records. It went very fast
>> >for what I estimate was around 200 dots, and then hit a wall. It made
>> >progress, but very, very slowly.
>> >
>> >So I killed it again and added a commit() call every time it output a
>> >dot. It didn't hit a wall, just some rough road (that is, it slowed
>> >down at about the same spot, but not nearly so drastically).
>> >
>> >The code makes no changes to the database at all. Why does commit()
>> >make a difference? What else should I learn from this?
>> >
>> >The field being used for the lookup has an index.
>> >
>> >++ kevin
>> >
>> >Code follows:
>> >
>> >#!/usr/bin/env python3
>> >"""Count the number of records that represent rows in the database
>> >'pos' table.
>> >The database is not modified.
>> >
>> >   Last Modified: Sat Nov 25 18:56:49 PST 2017
>> >"""
>> >
>> >import os.path   # https://docs.python.org/3.5/library/os.path.html
>> >import sys       # https://docs.python.org/3.5/library/sys.html
>> >import argparse  # https://docs.python.org/3.5/library/argparse.html
>> >import sqlite3   # https://docs.python.org/3.5/library/sqlite3.html
>> >import re        # https://docs.python.org/3.5/library/re.html
>> >
>> ># from /usr/local/lib/python3.5/dist-packages
>> >import qcreate
>> >from qerror import *
>> >import myparser
>> >
>> >if __name__ == '__main__':
>> >    parser = argparse.ArgumentParser(description="""A program to read
>> >        positions and count how many are in the database""")
>> >    parser.add_argument("--dbname", default=None,
>> >        help="name of the database to work on (overrides qubic.ini file)")
>> >    parser.add_argument("file", nargs='?', type=argparse.FileType('r'),
>> >        default=sys.stdin,
>> >        help="file containing the qsearch results (default stdin)")
>> >    args = parser.parse_args()
>> >    infile = args.file
>> >
>> >    if args.dbname is None:
>> >        here = os.path.split(os.path.realpath('.'))[1]
>> >        for confdir in ".", os.path.join(os.environ["HOME"], ".config"), "/etc/xdg/":
>> >            f = os.path.join(confdir, "qubic.ini")
>> >            if os.path.exists(f):
>> >                args.dbname = myparser.parse(f, here, "dbname")
>> >                if args.dbname is not None:
>> >                    break
>> >        if args.dbname is None:
>> >            print(" *** ERROR: no database name provided and none found in qubic.ini files")
>> >            sys.exit(1)
>> >
>> >    present = missing = lines = 0
>> >    with sqlite3.connect(args.dbname) as conn:
>> >        for line in infile:
>> >            fields = line.split()
>> >            pos = fields[0]
>> >            if len(pos) != 64:  # Important test to catch grep without --no-filename
>> >                raise InputError(" ERROR: input line has wrong-sized position: " + line)
>> >
>> >            pnum = None
>> >            for row in conn.execute("""
>> >                    SELECT pnum
>> >                    FROM pos
>> >                    WHERE ppos=?
>> >                    """, (pos,)):
>> >                pnum = row[0]
>> >                break
>> >            if pnum is None:
>> >                missing += 1
>> >            else:
>> >                present += 1
>> >            lines += 1
>> >            if lines % 1000 == 0:
>> >                print(".", flush=True, end="")
>> >                conn.commit()
>> >    print("there were", present, "records on file and", missing, "were missing")
>> >    print("out of a total of", lines, "records.")
>> >
>> >--
>> >word of the year: *kakistocracy*
>> >_______________________________________________
>> >sqlite-users mailing list
>> >sqlite-users@mailinglists.sqlite.org
>> >http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
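Keith's two suggestions -- the (ppos, pnum) covering index and a larger page cache -- can be sketched together as follows. This is an illustration only, not Kevin's actual schema: the table and column names are taken from his program, the in-memory database stands in for his real file, and the cache size is an arbitrary example value. EXPLAIN QUERY PLAN confirms the lookup is answered from the index btree alone:

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # illustration; point this at the real file
conn.execute("CREATE TABLE pos (ppos TEXT, pnum INTEGER)")

# The covering index: the SELECT below reads only this btree, never the table.
conn.execute("CREATE UNIQUE INDEX searchindex ON pos (ppos, pnum)")

# Enlarge the page cache well beyond the default (a negative value means KiB).
conn.execute("PRAGMA cache_size = -1048576")  # ~1 GiB; tune to available RAM

plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT pnum FROM pos WHERE ppos=?", ("x" * 64,)
).fetchall()
# The detail column should report a search USING COVERING INDEX searchindex.
print(plan[0][-1])
```

Without the index, the same plan would show a full table scan, which is consistent with the slowdown Kevin describes once the working set outgrows the default cache.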