Re: [sqlite] Using the static lib of sqlite3 under 64 bit ubuntu

2009-03-29 Thread Dan

On Mar 29, 2009, at 10:01 PM, Danny De Keuleneire wrote:

> Settings:
>
> Gcc 4.3.3
> Added usr/lib64/libsqlite3.a
>
> Error:
>
> Undefined reference to 'pthread_mutex_trylock'
>
> What I am missing?

  -lpthread -ldl

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] creating unique data takes many hours, help

2009-03-29 Thread Griggs, Donald
Hi Michael,


Two thoughts -- and I hope others will chime in if I'm off-base here:


1) Build just one index after the import:

>From page:
http://www.sqlite.org/lang_createtable.html
"The UNIQUE constraint causes an unique index to be created on the
specified columns."

I think that if you then create your own index, it will be redundant,
and, with millions of records, time-consuming.

However, you may well want to KEEP your own index (and add the UNIQUE
constrait to it), and *remove* the UNIQUE constraint on the table column
definition.   I hear that creating the index after the data has been
imported is faster (and less fragmented) than creating an implicit one
on-the-fly during import.



2) Hashing

If Vinnie [thev...@yahoo.com] was correct in guessing your goal (to
determine if a given string has been examined before), and if
performance remains a problem, I wondered if the design might benefit
from using hashing to "pre-qualify" a string and/or substitute for a
direct match.

A short hash (perhaps a 64-bit integer?) could be stored in a separate
table or even a separate database from the strings themselves.  If the
hash of a new string does not match any hash in the database, then you
know there is no need to search for the string itself.  If the entire
hash index can be kept in RAM, you might get a big benefit.

You could decide to take this a step further by using a longer hash as a
proxy for the string itself.  The hash would need to be long enough to
make collisions extremely unlikely, yet not so long as to negate the
value of using a proxy for the original strings.  In practice, you'd
probably want to compute a single long hash for each string, then use
the first X bytes as the "pre-qualifying" hash, and the remainder as the
"confirming hash." If your average string length is short, then using
*two* hashes may not be much of a gain. 

(Of course, since this is not a cryptographic application, you don't
need to worry about whether your chosen hash algorithm is "secure" or
not.)

Regards,
   Donald



-Original Message-
From: sqlite-users-boun...@sqlite.org
[mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Vinnie
Sent: Sunday, March 29, 2009 6:14 PM
To: sqlite-users@sqlite.org
Subject: Re: [sqlite] creating unique data takes many hours, help


Michael:

While my answer isn't strictly limited to sqlite, the performance nut in
me can't help myself. You have a a table with only one column, a string.
I could be wrong here but it seems like you just want to keep a list of
values that you have already tried. After you insert a bunch of strings
into the table you want to be able to quickly look up a string to see if
it exists, so that you can tell if you already probed that sequence
(taking a guess here).

If my guess is right, and the only thing you are doing is looking up
sorted single-column elements, you probably can get away with your own
quick disk-based binary tree implementation and avoid sqlite for this
particular circumstance altogether. The result would be several orders
of magnitude faster, even after you have followed the suggestions others
have given.




> Hi,
> 
> I am new with sqlite, and I create a program that reads several mllion

> records and puts them into a sqlite db using.
> 
> The table has one column ONLY indexed and unique, but it takes many 
> hours.
> 
> Is it necessary to pre-allocate the space, or is the anything that I 
> can do to reduce the time it takes.
> 
> this is how I create the db, table and index.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] creating unique data takes many hours, help

2009-03-29 Thread Vinnie

Michael:

While my answer isn't strictly limited to sqlite, the performance nut in me 
can't help myself. You have a a table with only one column, a string. I could 
be wrong here but it seems like you just want to keep a list of values that you 
have already tried. After you insert a bunch of strings into the table you want 
to be able to quickly look up a string to see if it exists, so that you can 
tell if you already probed that sequence (taking a guess here).

If my guess is right, and the only thing you are doing is looking up sorted 
single-column elements, you probably can get away with your own quick 
disk-based binary tree implementation and avoid sqlite for this particular 
circumstance altogether. The result would be several orders of magnitude 
faster, even after you have followed the suggestions others have given.




> Hi,
> 
> I am new with sqlite, and I create a program that reads
> several mllion
> records and puts them into a sqlite db using.
> 
> The table has one column ONLY indexed and unique, but it
> takes many hours.
> 
> Is it necessary to pre-allocate the space, or is the
> anything that I can
> do to reduce the time it takes.
> 
> this is how I create the db, table and index.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] creating unique data takes many hours, help

2009-03-29 Thread Jim Wilcoxson
Yes, you're right, but if the data is already in index order, you'll
do less I/O when creating the index.  Whether the sort + create DB
time is less than "create DB from random input" time is another
question.

Jim

On 3/29/09, mrobi...@cs.fiu.edu  wrote:

> question:
> When you say
>> 2)   Sort your input file on the PRIMARY KEY
>>  or on some other INDEX
>
> I thought that while sqlite inserts the data it is creating the indexes
> therefore sorting the data by way of the index, is this corret?
> I have decared one column ONLY, unique and indexed.
-- 
Software first.  Software lasts!
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] creating unique data takes many hours, help

2009-03-29 Thread mrobi002
Thanks very much.

I will try different values for the loops until a get an optimal number,
and I will try  the "PRAGMA cache_size = <#pages>". command.

question:
When you say
> 2)Sort your input file on the PRIMARY KEY
>   or on some other INDEX

I thought that while sqlite inserts the data it is creating the indexes
therefore sorting the data by way of the index, is this corret?
I have decared one column ONLY, unique and indexed.

Thanks again,

Michael





> On Sun, 29 Mar 2009 15:19:00 -0400 (EDT),
> mrobi...@cs.fiu.edu wrote:
>
>>Hi,
>>
>>I am new with sqlite, and I create a program that reads several mllion
>>records and puts them into a sqlite db using.
>>
>>The table has one column ONLY indexed and unique, but it takes many
>> hours.
>>
>>Is it necessary to pre-allocate the space, or is the anything that I can
>>do to reduce the time it takes.
>>
>>this is how I create the db, table and index.
>>
>>void openSqliteFile()
>>{
>>rc = sqlite3_open(genome_probesDB, ); //if it !exist creates it
>>
>>if (rc == SQLITE_OK) {
>>   printf("RC=%d database=%s was opened\n", rc, genome_probesDB );
>>}
>>else {
>>   printf("RC=%d database=%s COULD NOT OPEN\n", rc, genome_probesDB
>> );
>>}
>>
>>rc = sqlite3_exec(db, "CREATE TABLE probes (probe STRING unique);",
>>NULL, NULL, );
>>if (rc == SQLITE_OK) {
>>   printf("RC=%d table probes with field probe was created\n", rc );
>>}
>>else {
>>   printf("RC=%d table %s already exists, so it was NOT created\n",
>>rc, genome_probesDB );
>>}
>>
>>rc = sqlite3_exec(db, "CREATE INDEX probe on probes (probe);",  NULL,
>>NULL, );
>>if (rc == SQLITE_OK) {
>>   printf("RC=%d INDEX probe on table probes for field probe was
>>created\n", rc );
>>}
>>else {
>>printf("RC=%d INDEX probe on table %s already exists, so it was NOT
>>created\n", rc, genome_probesDB );
>>}
>>
>>
>>}//end void openSqliteFile()
>>
>>
>>and this is how I add the data:
>>char *zSQL = sqlite3_mprintf("INSERT INTO probes VALUES(%Q)", probe);
>>
>>sqlite3_exec(db, zSQL, 0, 0, 0);
>>sqlite3_free(zSQL);
>
> Two common optimizations:
>
> 1)Wrap the INSERT statements in a transaction
>   while not EOF on input file
>   BEGIN
>   loop 1 times or EOF
>   read input record
>   INSERT
>   endloop
>   COMMIT
>   endwhile
>
> 2)Sort your input file on the PRIMARY KEY
>   or on some other INDEX
>
>>Thanks very much
>>
>>Michael
> --
>   (  Kees Nuyt
>   )
> c[_]
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] creating unique data takes many hours, help

2009-03-29 Thread Kees Nuyt
On Sun, 29 Mar 2009 15:19:00 -0400 (EDT),
mrobi...@cs.fiu.edu wrote:

>Hi,
>
>I am new with sqlite, and I create a program that reads several mllion
>records and puts them into a sqlite db using.
>
>The table has one column ONLY indexed and unique, but it takes many hours.
>
>Is it necessary to pre-allocate the space, or is the anything that I can
>do to reduce the time it takes.
>
>this is how I create the db, table and index.
>
>void openSqliteFile()
>{
>rc = sqlite3_open(genome_probesDB, ); //if it !exist creates it
>
>if (rc == SQLITE_OK) {
>   printf("RC=%d database=%s was opened\n", rc, genome_probesDB );
>}
>else {
>   printf("RC=%d database=%s COULD NOT OPEN\n", rc, genome_probesDB );
>}
>
>rc = sqlite3_exec(db, "CREATE TABLE probes (probe STRING unique);", 
>NULL, NULL, );
>if (rc == SQLITE_OK) {
>   printf("RC=%d table probes with field probe was created\n", rc );
>}
>else {
>   printf("RC=%d table %s already exists, so it was NOT created\n",
>rc, genome_probesDB );
>}
>
>rc = sqlite3_exec(db, "CREATE INDEX probe on probes (probe);",  NULL,
>NULL, );
>if (rc == SQLITE_OK) {
>   printf("RC=%d INDEX probe on table probes for field probe was
>created\n", rc );
>}
>else {
>printf("RC=%d INDEX probe on table %s already exists, so it was NOT
>created\n", rc, genome_probesDB );
>}
>
>
>}//end void openSqliteFile()
>
>
>and this is how I add the data:
>char *zSQL = sqlite3_mprintf("INSERT INTO probes VALUES(%Q)", probe);
>
>sqlite3_exec(db, zSQL, 0, 0, 0);
>sqlite3_free(zSQL);

Two common optimizations:

1)  Wrap the INSERT statements in a transaction
while not EOF on input file
BEGIN
loop 1 times or EOF
read input record
INSERT 
endloop
COMMIT
endwhile

2)  Sort your input file on the PRIMARY KEY 
or on some other INDEX

>Thanks very much
>
>Michael
-- 
  (  Kees Nuyt
  )
c[_]
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Using the static lib of sqlite3 under 64 bit ubuntu

2009-03-29 Thread Danny De Keuleneire
Settings:

Gcc 4.3.3
Added usr/lib64/libsqlite3.a

Error:

Undefined reference to 'pthread_mutex_trylock'

What I am missing?

If I use the so shared lib no problems at all.

Tx,

Danny

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] select by min di

2009-03-29 Thread Igor Tandetnik
"baxy77bax"  wrote in
message news:22767988.p...@talk.nabble.com
> hi ,
> i need help with this one : i have a table with 2 columns C1 & C2:
>
> C1   C2
>
> a  1
> b  1
> b  2
> b  4
> c  3
> v  2
> c  5
> d  5
>
> and i need to get this:
>
> C1  C2
>
> a  1
> b  1
> c  3
> d  5
> v  2
>
> so a unique set of data in column C1 where C2 is the min id for that
> data ?

select C1, min(C2) from mytable
group by C1;

Igor Tandetnik 



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] select by min di

2009-03-29 Thread baxy77bax

hi , 
i need help with this one : i have a table with 2 columns C1 & C2:

C1   C2

a  1
b  1
b  2
b  4
c  3
v  2
c  5
d  5

and i need to get this:

C1  C2
 
a  1
b  1
c  3
d  5
v  2

so a unique set of data in column C1 where C2 is the min id for that data ?


thank you !


-- 
View this message in context: 
http://www.nabble.com/select-by-min-di-tp22767988p22767988.html
Sent from the SQLite mailing list archive at Nabble.com.

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] sequential row numbers from query

2009-03-29 Thread Adler, Eliedaat
Possible limitations of temporary tables vs views:

1) Temporary tables are static - views are dynamic -
i.e. anytime a change is made to the database the temporary tables might need 
to be recreated.

2) Memory: Temporary tables will be retained in memory until dropped. Views 
will only use memory when used.

Eli Adler


-Original Message-
From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] 
On Behalf Of Robert Citek
Sent: Saturday, March 28, 2009 12:03 AM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] sequential row numbers from query

That would work.  In fact, my current solution, which actually pipes to perl, 
works pretty well.  It's just that I have to then import the data back into the 
database.  So, I'd prefer to do the process entirely in SQL.

I was thinking maybe a view, but that didn't work.  Apparently, there is no 
rowid with views (or is there something equivalent that I'm overlooking?).

$ sqlite3 db 'create view bar as select * from foo order by field desc;
select rowid, * from bar ; '
|c
|b
|a

Substituting a temporary table for the view works:

$ sqlite3 db 'create temporary table bat as select * from foo order by field 
desc;
select rowid, * from bat ; '
1|c
2|b
3|a

Any limitations to consider when using a temporary table?  Any other ideas?

Regards,
- Robert

On Fri, Mar 27, 2009 at 3:37 PM, Thomas Briggs  wrote:
>   Holy cow that feels inefficient.
>
>   It's a bit clunky, but why not insert into a temporary table,
> ordered as desired, and then use the rowid from the temp table?
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

This e-mail is confidential, the property of NDS Ltd and intended for the 
addressee only. Any dissemination, copying or distribution of this message or 
any attachments by anyone other than the intended recipient is strictly 
prohibited. If you have received this message in error, please immediately 
notify the postmas...@nds.com and destroy the original message. Messages sent 
to and from NDS may be monitored. NDS cannot guarantee any message delivery 
method is secure or error-free. Information could be intercepted, corrupted, 
lost, destroyed, arrive late or incomplete, or contain viruses. We do not 
accept responsibility for any errors or omissions in this message and/or 
attachment that arise as a result of transmission. You should carry out your 
own virus checks before opening any attachment. Any views or opinions presented 
are solely those of the author and do not necessarily represent those of NDS.

To protect the environment please do not print this e-mail unless necessary.

NDS Limited Registered Office: One London Road, Staines,Middlesex TW18 4EX, 
United Kingdom. A company registered in England and Wales Registered no. 
3080780 VAT no. GB 603 8808 40-00
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users