Re: [sqlite] Performance & database design

2006-03-22 Thread Jim Dodgen
Back in my past I used Oracle OCI and did "array" inserts: you would
load an array for each column to be inserted, bind the arrays to the
insert statement, and then do one big insert.

It was quite a fast way to load data.
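
A rough sketch of the closest SQLite idiom (table and column names are
hypothetical): one prepared INSERT, re-bound for each row inside a
single transaction.

#include <sqlite3.h>
#include <vector>

// Load many rows through one prepared statement inside one
// transaction -- the SQLite analogue of an OCI array insert.
// (Error checking omitted for brevity.)
void bulk_insert(sqlite3 *db, const std::vector<int> &a,
                 const std::vector<int> &b)
{
    sqlite3_stmt *stmt = 0;
    sqlite3_exec(db, "BEGIN", 0, 0, 0);
    sqlite3_prepare(db, "INSERT INTO t(a,b) VALUES(?,?)", -1, &stmt, 0);
    for (size_t i = 0; i < a.size(); ++i) {
        sqlite3_bind_int(stmt, 1, a[i]);
        sqlite3_bind_int(stmt, 2, b[i]);
        sqlite3_step(stmt);   // run the insert for this row
        sqlite3_reset(stmt);  // re-arm the statement for the next row
    }
    sqlite3_finalize(stmt);
    sqlite3_exec(db, "COMMIT", 0, 0, 0);
}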

Joe Wilson wrote:
Some people on the list have noted that inserting pre-sorted 
rows in sub-batches into SQLite is faster than inserting 
unsorted rows. Granted, you could only do this for one index 
per table, but might this be the basis of an optimization?
(I have not looked at the insert code. Perhaps SQLite is 
already doing this for all I know.)


Would this be a bad time to ask for multi-row insert support? :-)
INSERT INTO tbl_name(a,b,c) VALUES(1,2,3),(4,5,6),(7,8,9);
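
(SQLite does not accept that multi-row VALUES syntax, but a compound
SELECT gives the same effect -- a small sketch:

INSERT INTO tbl_name(a,b,c)
SELECT 1,2,3 UNION ALL SELECT 4,5,6 UNION ALL SELECT 7,8,9;

or simply several single-row INSERTs inside one transaction.)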

  

The way indices work in SQLite is that there is one row
in the index for each row in the table but the index rows
are in index order.  If the indexed values are randomly
distributed in the table, that means building the index
requires inserting each row in a random spot in the middle
of the index.  Constantly inserting things in random places
means that there is no locality of reference and the pager
cache does not perform well.  It is not clear to me what
can be done about this other than to completely redesign
how indices work.  And even then, I cannot think of an
alternative design that would do much better.
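
A minimal illustration of the pre-sorting observation above (the Row
struct and names are hypothetical): sort each batch on the indexed
column before inserting, so consecutive index inserts land on nearby
b-tree pages and the pager cache stays warm.

#include <algorithm>
#include <string>
#include <vector>

struct Row { std::string name; int value; };

static bool byName(const Row &x, const Row &y) { return x.name < y.name; }

// Sort a batch on the indexed column before inserting it, so the
// index rows are added in nearly sequential positions.
void sort_batch(std::vector<Row> &batch)
{
    std::sort(batch.begin(), batch.end(), byName);
    // ...then insert the batch in this order (see the earlier sketch).
}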







[sqlite] retrieving the column names for a query (PERL)

2006-03-22 Thread Jim Dodgen

for a query like

select * from a join b on a.x = b.z

does anyone know how to get all the column names of the fields that
would be returned by the query?


I am using the DBD::SQLite Perl module

Thanks

Jim


Re: Re[2]: [sqlite] Performance & database design

2006-03-22 Thread Jay Sprenkle
On 3/22/06, Teg <[EMAIL PROTECTED]> wrote:
> Hello Jay,
>
> The best way I've found to get great performance out of strings and
> vectors is to re-use the strings and vectors. String creation speed is
> completely dependent on allocation speed, so by re-using the strings
> you only grow the ones that aren't already big enough to hold the new
> string data; eventually they don't grow at all.
>
> At least with the STLport STL, which I use, a "clear" doesn't
> de-allocate the space; re-use then simply fills in the already
> allocated space.
>
> With re-use, speeds are easily as fast as straight C after an initial
> setup time.

Ah. There is a method to set the size of the vector in advance if you
know what it will be. It's a great deal faster not to pay for all the
repeated reallocation and memory movement if you can code it that way.
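
For instance, a trivial sketch using std::vector::reserve, which sets
the capacity up front so later push_back calls never reallocate:

#include <vector>

void fill(std::vector<int> &v, int n)
{
    v.reserve(n);            // one allocation up front
    for (int i = 0; i < n; ++i)
        v.push_back(i);      // no reallocation, no element copying
}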


Re[2]: [sqlite] Performance & database design

2006-03-22 Thread Teg
Hello Jay,

The best way I've found to get great performance out of strings and
vectors is to re-use the strings and vectors. String creation speed is
completely dependent on allocation speed, so by re-using the strings
you only grow the ones that aren't already big enough to hold the new
string data; eventually they don't grow at all.

At least with the STLport STL, which I use, a "clear" doesn't
de-allocate the space; re-use then simply fills in the already
allocated space.

With re-use, speeds are easily as fast as straight C after an initial
setup time.
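
A minimal sketch of that reuse pattern (with STLport, as noted above,
clear() keeps the existing allocation; other implementations may differ):

#include <cstdio>
#include <string>

void process_lines(std::FILE *in)
{
    std::string line;             // hoisted out of the loop, reused
    char buf[512];
    while (std::fgets(buf, sizeof buf, in)) {
        line.clear();             // empties the string, keeps the buffer
        line += buf;              // grows only if this line is longer
        // ... use line ...
    }
}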


C

Wednesday, March 22, 2006, 6:20:07 PM, you wrote:

JS> On 3/22/06, Micha Bieber
JS> <[EMAIL PROTECTED]> wrote:
>> Finally, I've learned my lesson. Because it might be of some interest
>> for the beginner:
>>
>> 1) Use the associated sqlite3_bind_* variants for your data.
>> I made the mistake of converting back and forth to strings beforehand.
>>
>> 2) It broke my program design a bit, but setting up large STL-vector-
>> based C++ string records (especially using push_back and stringstreams
>> for conversions) turned out to be a show stopper on some platforms. The
>> plan was to feed them to sqlite3_bind_* later on (with the additional
>> performance drop caused by exclusively using sqlite3_bind_text,
>> mentioned above). Setting up the structures took more time than the
>> asynchronous writing to the database.

JS> Huh. I'll have to do some testing. I thought vector was pretty speedy.
JS> Thanks for posting your results!



-- 
Best regards,
 Teg    mailto:[EMAIL PROTECTED]



Re: [sqlite] Re: Select with left outer join - Am I going mad or just having a bad day?

2006-03-22 Thread Ed Porter
Hi Ian,

This one has been interesting! I'm trying to repeat the problem with
several different databases - so far no luck. I use SQLite 2.8xx (project
is already ongoing) and will test when I have time. I will post results if
I find anything of value.




At 06:41 PM 3/21/06, you wrote:
>
>On 21 Mar 2006, at 18:11, Ed Porter wrote:
>
>> Hi Ian,
>>
>> You need to use a full outer join. I don't know if SQLite has this
>> function.
>>
>>
>
>Nah, definitely only wanted a left outer, didn't want the results
>multiplied up or anything, just wanted every Category and any Cases
>matching the criteria.
>
>It was solved by moving the where clause up into an AND on the join;
>it seems that the where clause was choking the result set from the
>outer join. I'd never seen this before, but then again I've generally
>used "old style" syntax joins (in the where clause), not ANSI, so I'm a
>bit new to how the where clause can affect the results of an ANSI join.
>But I'm learning!
>
>Thanks for your response though Ed, much appreciated.
>
>Regards,
>--
>Ian M. Jones
>___
>IMiJ Software
>http://www.imijsoft.com
>http://www.ianmjones.net (blog)
>
>
>
Sincerely,

Ed Porter


Re: [sqlite] Performance & database design

2006-03-22 Thread Jay Sprenkle
On 3/22/06, Micha Bieber <[EMAIL PROTECTED]> wrote:
> Finally, I've learned my lesson. Because it might be of some interest
> for the beginner:
>
> 1) Use the associated sqlite3_bind_* variants for your data.
> I made the mistake of converting back and forth to strings beforehand.
>
> 2) It broke my program design a bit, but setting up large STL-vector-
> based C++ string records (especially using push_back and stringstreams
> for conversions) turned out to be a show stopper on some platforms. The
> plan was to feed them to sqlite3_bind_* later on (with the additional
> performance drop caused by exclusively using sqlite3_bind_text,
> mentioned above). Setting up the structures took more time than the
> asynchronous writing to the database.

Huh. I'll have to do some testing. I thought vector was pretty speedy.
Thanks for posting your results!


Re: [sqlite] Scrolling thru an index

2006-03-22 Thread Jay Sprenkle
> > My application is geared towards users who want to find a specific name
> > in a list of names, and then want to have the possibility to scroll
> > backwards or forwards.  For example, if I search for "Sprenkle" I want
> > to show the user a window with "Sprenkle" in the middle, preceded by the
> > 50 names before it, and followed by the 50 names after it, and also to
> > be able to smoothly scroll in either direction.
> >
> > I know the index contains sufficient data to do this, but there seems to
> > be no way to use it from SQLite.

Get it in two chunks.

The first 100 names after the name in question:
select x from mytable where Name > 'sprenkle' order by Name limit 100

and the 100 names before the name in question:
select x from mytable where Name < 'sprenkle' order by Name desc limit 100
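
One detail worth noting (a sketch, assuming the window is displayed in
ascending order): the second query returns its rows nearest-to-
'sprenkle' first, so that chunk has to be reversed before it is placed
above the search name.

#include <algorithm>
#include <string>
#include <vector>

// before: fetched with ORDER BY Name DESC LIMIT 100 (nearest first)
// after:  fetched with ORDER BY Name ASC  LIMIT 100
std::vector<std::string> make_window(std::vector<std::string> before,
                                     const std::vector<std::string> &after)
{
    std::reverse(before.begin(), before.end()); // restore ascending order
    before.insert(before.end(), after.begin(), after.end());
    return before;
}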


RE: [sqlite] Vacuum slow

2006-03-22 Thread Robert Simpson
When you issue the VACUUM statement, the OS ends up loading a lot of the
data from the database into its disk cache.  Since you're running the
select right afterwards, SQLite ends up loading the pages from the
underlying OS cache, so yes, it's going to *appear* faster than it would
after an autovacuum.

Robert


> -Original Message-
> From: Tobias_Rundström [mailto:[EMAIL PROTECTED] 
> Sent: Wednesday, March 22, 2006 3:01 PM
> To: sqlite-users@sqlite.org
> Subject: Re: [sqlite] Vacuum slow
> 
> 
> On 02-03-2006, at 13:35, [EMAIL PROTECTED] wrote:
> > The VACUUM command does something very much like this:
> >
> > sqlite3 olddb .dump | sqlite3 newdb; mv newdb olddb
> >
> > I say "much like" the above because there are some important
> > differences.  The VACUUM command transfers the data from the old
> > to the new database in binary without having to convert it into
> > text.  And the whole operation is protected by a rollback journal
> > so that if a power failure occurs in the middle, the operation
> > will roll back.
> >
> > But the point is this:  VACUUM recreates the whole database from
> > scratch.  The time required is proportional to the amount of data
> > that is in the database.
> 
> I have noticed that when I import a lot of data (40-50k rows)
> into my application and run a select (fairly complex, with joins)
> afterwards, it will be a bit slow (fragmented indexes or something),
> but rerunning the select after a VACUUM will *GREATLY* speed things
> up.  This doesn't seem to happen if I set the autovacuum flag;
> vacuum still seems to do something different.
> 
> Any ideas?
> 
> -- Tobias



Re: [sqlite] Vacuum slow

2006-03-22 Thread Tobias Rundström


On 02-03-2006, at 13:35, [EMAIL PROTECTED] wrote:

The VACUUM command does something very much like this:

sqlite3 olddb .dump | sqlite3 newdb; mv newdb olddb

I say "much like" the above because there are some
important differences.  The VACUUM command transfers
the data from the old and new database in binary
without having to convert it into text.  And the
whole operation is protected by a rollback journal
so that if a power failure occurs in the middle, the
operation will rollback.

But the point is this:  VACUUM recreates the whole
database from scratch.  The time required is proportional
to the amount of data that is in the database.


I have noticed that when I import a lot of data (40-50k rows) into my
application and run a select (fairly complex, with joins) afterwards,
it will be a bit slow (fragmented indexes or something), but rerunning
the select after a VACUUM will *GREATLY* speed things up.
This doesn't seem to happen if I set the autovacuum flag; vacuum
still seems to do something different.


Any ideas?

-- Tobias

Re: [sqlite] Different column names in callback function on Linux vs. Windows

2006-03-22 Thread Derrell . Lipman
"Iulian Popescu" <[EMAIL PROTECTED]> writes:

> I checked the versions and indeed the one I'm using on Windows is 3.0.8
> whereas the one on Linux is 3.1.2. That being said, as far as I understand
> (and please correct me if I'm wrong), the two PRAGMAs are just commands
> you run to modify the operation of the SQLite library. I haven't invoked
> either of them before running the statements I mentioned, so I assume the
> default behavior was used. How can I find out what this is for the
> particular versions I'm running?

You can find out the current values like this:

   % sqlite3 :memory:
   SQLite version 3.2.1
   Enter ".help" for instructions
   sqlite> pragma full_column_names;
   0
   sqlite> pragma short_column_names;
   1
   sqlite>

but beware that the *meaning* of the pragmas changed over time, so just having
the same values doesn't necessarily mean that the same column names will be
returned by your queries.  (You can try to find out what changes were made
when, using the "timeline" on the sqlite.org website.  I don't recall when the
changes I'm referencing occurred.)

For full compatibility in this area, the easiest method is to just ensure that
you're using the same version of sqlite on both platforms, and then set the
pragma values the same.
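
A small sketch of that advice (the pragma values shown are only an
example -- pick whichever combination your code expects):

#include <sqlite3.h>

// Pin the column-naming pragmas right after opening the connection
// so both platforms report column names the same way.
int open_pinned(const char *path, sqlite3 **db)
{
    int rc = sqlite3_open(path, db);
    if (rc != SQLITE_OK) return rc;
    sqlite3_exec(*db, "PRAGMA short_column_names=1;", 0, 0, 0);
    sqlite3_exec(*db, "PRAGMA full_column_names=0;", 0, 0, 0);
    return SQLITE_OK;
}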

Derrell


RE: [sqlite] Different column names in callback function on Linux vs. Windows

2006-03-22 Thread Iulian Popescu
Hi Derrell,

I checked the versions and indeed the one I'm using on Windows is 3.0.8
whereas the one on Linux is 3.1.2. That being said, as far as I understand
(and please correct me if I'm wrong), the two PRAGMAs are just commands
you run to modify the operation of the SQLite library. I haven't invoked
either of them before running the statements I mentioned, so I assume the
default behavior was used. How can I find out what this is for the
particular versions I'm running?

Thanks,
Iulian.

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] 
Sent: Wednesday, March 22, 2006 3:20 PM
To: sqlite-users@sqlite.org
Subject: Re: [sqlite] Different column names in callback function on Linux
vs. Windows

"Iulian Popescu" <[EMAIL PROTECTED]> writes:

> I'm doing an application port from Windows to Linux and one of the
> problems I'm facing is when executing the following statement through
> a call to sqlite3_exec():
>
> SELECT mytable.'mycolumn' FROM table
>
> The registered callback function's 4th argument (a char**), denoting
> the column names, contains the string mytable.'mycolumn' on Windows
> and the string mycolumn on Linux. Has anyone any idea why that would be?

I suspect you're using different versions of sqlite on Windows and Linux, or
you have pragma settings set differently on the two OSs.  You can verify the
version of sqlite with "sqlite3 -version".

There have been changes, through the development of sqlite, on what column
names to return.  IIRC, the meanings of

  PRAGMA short_column_names
and
  PRAGMA full_column_names

have changed a couple of times, and these affect exactly the
information that you're having trouble with.

You should first ensure that you are running the same version of sqlite
on the two OSs.  Then ensure that the settings of these two pragmas are
the same.  With both the version and the pragma settings the same, I
believe you should get the same values passed to the callback function,
given the same query.  The Windows and Linux ports are built from the
same source, so they should generate similar results.

Derrell




Re: [sqlite] Scrolling thru an index

2006-03-22 Thread John Stanton

JP wrote:

Jay Sprenkle wrote:


Is there a way I can scroll thru a particular index?  For example:

1. Scroll forward/backward on a given set of records
2. Start at position X
3. Start at a record that matches a criteria




SQL is optimized to manipulate a set of records. It's much faster to 
execute

"update mytable set mycolumn = 0 where mycondition = true"
than to iterate through them and set them to zero individually.
Can you do whatever it is with sql instead of code?

The index is used transparently to locate rows faster.
You don't iterate the index, you iterate the rows.



I am thinking more in terms of the query/read part, than on updates.

My application is geared towards users who want to find a specific name 
in a list of names, and then want to have the possibility to scroll 
backwards or forwards.  For example, if I search for "Sprenkle" I want 
to show the user a window with "Sprenkle" in the middle, preceded by the 
50 names before it, and followed by the 50 names after it, and also to 
be able to smoothly scroll in either direction.


I know the index contains sufficient data to do this, but there seems to 
be no way to use it from SQLite.


I understand that getting the data by "chunks" or "pages" the way SQL 
does is perfect for client/server models.  But, given that SQLite is 
more geared towards standalone/embedded systems, it wouldn't hurt to 
have an extra mechanism for 'local data scrolling', maybe thru standard 
SQL cursors, which it currently doesn't support.


Anyway, maybe separate topic, I tried to create a "snapshot" window of 
the above using plain SQL, but it doesn't seem to work on Sqlite 3.3.4:


CREATE TABLE customers (custid integer primary key, lastname varchar(50));
CREATE INDEX cidx ON customers (lastname);

(insert 10,000 records here)

SELECT idx,lastname FROM
(
-- get names before search criteria
SELECT '1' as idx,lastname
FROM (select lastname from customers where lastname<'sprenkle'
ORDER BY lastname DESC LIMIT 50)
UNION
-- get names matching search criteria plus 50 more
SELECT '2' as idx,lastname
FROM (select lastname from customers where lastname>='sprenkle'
ORDER BY lastname ASC LIMIT 50)
)
order by 1,2;

Individually, the queries work fine.  In UNION, each seems to lose the
inner order clause and shows inaccurate results.


jp

I understand what you are trying to do.  It was a very common way of 
using index files for lookups.  You could try using SQL to get N rows 
before your name and N rows after using LIMIT N, then moving this window 
as the user scrolls.
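
One possible culprit in the query above (a guess, not verified against
3.3.4): UNION implies duplicate elimination and makes no ordering
promise of its own; the inner ORDER BY clauses only determine which 50
rows each LIMIT keeps. A sketch of the same window using UNION ALL, so
that only the final ORDER BY matters:

SELECT idx, lastname FROM
(
  SELECT '1' AS idx, lastname
  FROM (SELECT lastname FROM customers
        WHERE lastname < 'sprenkle' ORDER BY lastname DESC LIMIT 50)
  UNION ALL
  SELECT '2' AS idx, lastname
  FROM (SELECT lastname FROM customers
        WHERE lastname >= 'sprenkle' ORDER BY lastname ASC LIMIT 50)
)
ORDER BY 1, 2;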


Re: [sqlite] Performance & database design

2006-03-22 Thread Micha Bieber
Finally, I've learned my lesson. Because it might be of some interest
for the beginner:

1) Use the associated sqlite3_bind_* variants for your data.
I made the mistake of converting back and forth to strings beforehand.

2) It broke my program design a bit, but setting up large STL-vector-
based C++ string records (especially using push_back and stringstreams
for conversions) turned out to be a show stopper on some platforms. The
plan was to feed them to sqlite3_bind_* later on (with the additional
performance drop caused by exclusively using sqlite3_bind_text,
mentioned above). Setting up the structures took more time than the
asynchronous writing to the database.

This way my 40 minutes per 25M rows dropped to 8 minutes, something I'm
completely happy with.
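
A tiny sketch of lesson 1 (column positions and types are hypothetical):
bind the native value instead of routing it through a string first.

#include <sqlite3.h>

void bind_row(sqlite3_stmt *stmt, int id, double score)
{
    // Slow path from lesson 1: format the value as text, bind as text:
    //   char buf[32]; snprintf(buf, sizeof buf, "%d", id);
    //   sqlite3_bind_text(stmt, 1, buf, -1, SQLITE_TRANSIENT);

    // Direct path: hand SQLite the typed values.
    sqlite3_bind_int(stmt, 1, id);
    sqlite3_bind_double(stmt, 2, score);
}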

I'm sure many things here are quite standard for the experts, but
for the beginner it might be useful to know. I have seen a Wiki item
dealing with performance. Would it be possible to give some emphasis to
the problem in the standard documentation too?

Thank you for the software, Richard (et al ?).
I like the library a lot.

Micha  
-- 



Re: [sqlite] Different column names in callback function on Linux vs. Windows

2006-03-22 Thread Derrell . Lipman
"Iulian Popescu" <[EMAIL PROTECTED]> writes:

> I'm doing an application port from Windows to Linux and one of the
> problems I'm facing is when executing the following statement through a call
> to sqlite3_exec():
>
> SELECT mytable.'mycolumn' FROM table
>
> The registered callback function's 4th argument (a char**), denoting the
> column names, contains the string mytable.'mycolumn' on Windows and the
> string mycolumn on Linux. Has anyone any idea why that would be?

I suspect you're using different versions of sqlite on Windows and Linux, or
you have pragma settings set differently on the two OSs.  You can verify the
version of sqlite with "sqlite3 -version".

There have been changes, through the development of sqlite, on what column
names to return.  IIRC, the meanings of

  PRAGMA short_column_names
and
  PRAGMA full_column_names

have changed a couple of times, and these affect exactly the information that
you're having trouble with.

You should first ensure that you are running the same version of sqlite
on the two OSs.  Then ensure that the settings of these two pragmas are
the same.  With both the version and the pragma settings the same, I
believe you should get the same values passed to the callback function,
given the same query.  The Windows and Linux ports are built from the
same source, so they should generate similar results.

Derrell


Re: [sqlite] Re: concers about database size

2006-03-22 Thread Daniel Franke
> > This may take a while, about 20 hours maybe. The partition has approx
> > 10GB, I can't afford more. Let's hope that this is sufficient.
>
> 20 hours seems rather long. Even if you have to worry about uniqueness
> constraints, there are ways to deal with that that should be much faster
> (deal with the data in chunks, load into temp tables, check for dupes,
> etc).

It is longer than necessary, that's true. I implemented it the way I did
for internal logging purposes. I'm not yet convinced that those logging
capabilities are worth the delay. I'm still in testing mode ;)


> I can tell you that even 750M rows wouldn't be a huge deal for PostgreSQL,
> and 20G of data is nothing. Though your table would take somewhere
> around 30G due to the higher per-row overhead in PostgreSQL; I'm not
> really sure how large the indexes would be.

AFAIK, PostgreSQL is implemented in a client-server architecture. 
For maintainability, I try to avoid such a thing.


> As for performance, I haven't seen a single mention of any kind of
> metrics you'd like to hit, so it's impossible to guess as to whether
> SQLite, PostgreSQL, or anything else would suffice. 

I posted a couple of timings a few days ago. As far as I can tell, the 
performance of sqlite will suffice for my tasks, even if run on usual pc 
hardware =)


> As for partitioning, you might still have a win if you can identify some
> common groupings, and partition based on that. Even if you can't, you
> could at least get a win on single-person queries.

The data could easily be grouped by chromosome, but I would like to avoid
this, too. I expect it'd be sort of a hassle to do multi-chromosome
queries.


Thanks for your input, nevertheless!

Regards
Daniel



Re: [sqlite] Scrolling thru an index

2006-03-22 Thread JP

Ulrik Petersen wrote:

Hi JP,

JP wrote:

Anyway, maybe separate topic, I tried to create a "snapshot" window of 
the above using plain SQL, but it doesn't seem to work on Sqlite 3.3.4:


CREATE TABLE customers (custid integer primary key, lastname varchar(50));
CREATE INDEX cidx ON customers (lastname);

(insert 10,000 records here)

SELECT idx,lastname FROM
(
-- get names before search criteria
SELECT '1' as idx,lastname
FROM (select lastname from customers where lastname<'sprenkle'
ORDER BY lastname DESC LIMIT 50)
UNION
-- get names matching search criteria plus 50 more
SELECT '2' as idx,lastname
FROM (select lastname from customers where lastname>='sprenkle'
ORDER BY lastname ASC LIMIT 50)
)
order by 1,2;

Individually, the queries work fine.  In UNION, each seems to lose the
inner order clause and shows inaccurate results.


jp



Instead of "order by 1,2", don't you mean "order by idx"?

Ulrik P.


No, "order by 1,2" is equivalent to "order by idx, lastname". Maybe I
should have used 'a' and 'b' to make things clearer in the SELECT
statements.


jp.


Re: [sqlite] Scrolling thru an index

2006-03-22 Thread Ulrik Petersen

Hi JP,

JP wrote:

Anyway, maybe separate topic, I tried to create a "snapshot" window of 
the above using plain SQL, but it doesn't seem to work on Sqlite 3.3.4:


CREATE TABLE customers (custid integer primary key, lastname varchar(50));
CREATE INDEX cidx ON customers (lastname);

(insert 10,000 records here)

SELECT idx,lastname FROM
(
-- get names before search criteria
SELECT '1' as idx,lastname
FROM (select lastname from customers where lastname<'sprenkle'
ORDER BY lastname DESC LIMIT 50)
UNION
-- get names matching search criteria plus 50 more
SELECT '2' as idx,lastname
FROM (select lastname from customers where lastname>='sprenkle'
ORDER BY lastname ASC LIMIT 50)
)
order by 1,2;

Individually, the queries work fine.  In UNION, each seems to lose the
inner order clause and shows inaccurate results.


jp



Instead of "order by 1,2", don't you mean "order by idx"?

Ulrik P.



Re: [sqlite] Scrolling thru an index

2006-03-22 Thread JP

Jay Sprenkle wrote:

Is there a way I can scroll thru a particular index?  For example:

1. Scroll forward/backward on a given set of records
2. Start at position X
3. Start at a record that matches a criteria



SQL is optimized to manipulate a set of records. It's much faster to execute
"update mytable set mycolumn = 0 where mycondition = true"
than to iterate through them and set them to zero individually.
Can you do whatever it is with sql instead of code?

The index is used transparently to locate rows faster.
You don't iterate the index, you iterate the rows.


I am thinking more in terms of the query/read part, than on updates.

My application is geared towards users who want to find a specific name 
in a list of names, and then want to have the possibility to scroll 
backwards or forwards.  For example, if I search for "Sprenkle" I want 
to show the user a window with "Sprenkle" in the middle, preceded by the 
50 names before it, and followed by the 50 names after it, and also to 
be able to smoothly scroll in either direction.


I know the index contains sufficient data to do this, but there seems to 
be no way to use it from SQLite.


I understand that getting the data by "chunks" or "pages" the way SQL 
does is perfect for client/server models.  But, given that SQLite is 
more geared towards standalone/embedded systems, it wouldn't hurt to 
have an extra mechanism for 'local data scrolling', maybe thru standard 
SQL cursors, which it currently doesn't support.


Anyway, maybe separate topic, I tried to create a "snapshot" window of 
the above using plain SQL, but it doesn't seem to work on Sqlite 3.3.4:


CREATE TABLE customers (custid integer primary key, lastname varchar(50));
CREATE INDEX cidx ON customers (lastname);

(insert 10,000 records here)

SELECT idx,lastname FROM
(
-- get names before search criteria
SELECT '1' as idx,lastname
FROM (select lastname from customers where lastname<'sprenkle'
ORDER BY lastname DESC LIMIT 50)
UNION
-- get names matching search criteria plus 50 more
SELECT '2' as idx,lastname
FROM (select lastname from customers where lastname>='sprenkle'
ORDER BY lastname ASC LIMIT 50)
)
order by 1,2;

Individually, the queries work fine.  In UNION, each seems to lose the
inner order clause and shows inaccurate results.


jp



Re: [sqlite] Re: concers about database size

2006-03-22 Thread Jim C. Nasby
On Thu, Mar 16, 2006 at 09:53:27PM +0100, Daniel Franke wrote:
> 
> > That would be an excellent question to add to the FAQ:
> > "How do I estimate the resource requirements for a database?"
> 
> I spent some time to create 3GB of sample data (just zeros, about half the 
> size of the actual data set I have to deal with). I'm currently importing it 
> into the database. As far as I already can tell: the main index will be 
> approx 4GB in size. AFAIK, 32bit systems never grant more than 2GB RAM per 
> process ... 
> 
> This may take a while, about 20 hours maybe. The partition has approx 10GB, I 
> can't afford more. Let's hope that this is sufficient.

20 hours seems rather long. Even if you have to worry about uniqueness
constraints, there are ways to deal with that that should be much faster
(deal with the data in chunks, load into temp tables, check for dupes,
etc).

I've not used SQLite, so I can't really speak to its capabilities. I
can tell you that even 750M rows wouldn't be a huge deal for PostgreSQL,
and 20G of data is nothing. Though your table would take somewhere
around 30G due to the higher per-row overhead in PostgreSQL; I'm not
really sure how large the indexes would be.

As for performance, I haven't seen a single mention of any kind of
metrics you'd like to hit, so it's impossible to guess as to whether
SQLite, PostgreSQL, or anything else would suffice. I can give you this
metric though: the main table behind http://stats.distributed.net has
134M rows and I can do a group-by count on it in 188 seconds (and that's
after blowing out the cache). This table is wider than yours:

   Table "public.email_contrib"
   Column   |  Type   | Modifiers 
+-+---
 project_id | integer | not null
 id | integer | not null
 date   | date| not null
 team_id| integer | 
 work_units | bigint  | not null

select project_id, count(*) from email_contrib group by project_id;
 project_id |  count
------------+----------
          5 | 56427141
        205 | 58512516
          3 |   336550
         25 |  6756695
          8 | 11975041
         24 |   626317

It takes up about 8G on disk.

I can also do index-scans fairly fast.
http://cvs.distributed.net/viewcvs.cgi/stats-proc/daily/audit.sql?annotate=1.45
is an auditing script that (among other things) scans through every
record in email_contrib for a particular project_id, while joining to
another table (see the code starting at line 170). It takes about 70
seconds to do this for project_id 8 or 25. All this is happening on a
dual Opteron (242, I think) with the OS and transaction logs on a 2
SATA drive mirror and the data stored on a 6 SATA drive RAID 10. The
machine has 4G of memory. This certainly isn't what I'd consider to be
'big expensive hardware'.

As for partitioning, you might still have a win if you can identify some
common groupings, and partition based on that. Even if you can't, you
could at least get a win on single-person queries.
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461


Re: [sqlite] Has anyone used sqlite for Pocket PC development?

2006-03-22 Thread Brad
core source code supports it.  The provider I wrote that Brad mentions
is for VS2005 and .NET 2.0, but there does exist a .NET 1.1 provider
from Finisar: http://sourceforge.net/projects/adodotnetsqlite
I can't vouch for its performance since I've never actually used it.

I've used the Finisar one on the desktop in the past, and it works quite 
well there.  I've never used the CF version of it, though.  Since moving 
my own projects to .NET 2.0, I haven't used it at all.




RE: [sqlite] Has anyone used sqlite for Pocket PC development?

2006-03-22 Thread Denis Sbragion
Hello Robert,

On Wed, March 22, 2006 15:32, Robert Simpson wrote:
...
> I can't vouch for its performance since I've never actually used it.

we've been using SQLite under WinCE since version 2.1.7, with excellent
performance. We have been able to handle databases with more than 10
records and up to more than 15 Mb without much trouble. Most of the time
the programming language is the bottleneck, with queries executed in no
time and most of the computing time spent filling grids, combo boxes and
so on. SQLite also rocks when it comes to reliability, with almost no
data loss despite the problematic mobile environment.

We haven't tried the recent 3.X version yet, but 2.8 works like a charm, both
with the old Embedded Visual Tools environment and the newer .NET Compact
Framework. BTW we're not using the .NET data provider but direct calls to a
DLL that wraps SQLite so that it is usable from the development languages we
use.

Bye,

-- 
Denis Sbragion
InfoTecna
Tel: +39 0362 805396, Fax: +39 0362 805404
URL: http://www.infotecna.it



Re: [sqlite] Scrolling thru an index

2006-03-22 Thread Jay Sprenkle
> Is there a way I can scroll thru a particular index?  For example:
>
> 1. Scroll forward/backward on a given set of records
> 2. Start at position X
> 3. Start at a record that matches a criteria

SQL is optimized to manipulate a set of records. It's much faster to execute
"update mytable set mycolumn = 0 where mycondition = true"
than to iterate through them and set them to zero individually.
Can you do whatever it is with sql instead of code?

The index is used transparently to locate rows faster.
You don't iterate the index, you iterate the rows.


Re: [sqlite] Has anyone used sqlite for Pocket PC development?

2006-03-22 Thread Brad

I am planning to use sqlite with VS .Net 2003 Smart device C#
application.  Just wondering if anyone has blazed down this path
before and has any insights to share.


The big thing to remember is that you are not programming for a desktop 
device, nor even a laptop.  If you can limit the use of the keyboard, 
you should.  Don't try to do data entry, unless you have another way to 
do it, such as a barcode scanner.


Check out http://sqlite.phxsoftware.com/ for a SQLite provider for .NET 
and the CF.


For some reading, I picked up this book last year, and it has some 
really good information in it.  There are a number of other books 
available as well.

http://www.amazon.com/gp/product/0735617252/sr=8-2/qid=1143033346/ref=pd_bbs_2/104-2336543-8207937?%5Fencoding=UTF8


This one also looks pretty good.
http://www.amazon.com/gp/product/0321174038/sr=8-1/qid=1143033346/ref=pd_bbs_1/104-2336543-8207937?%5Fencoding=UTF8 



RE: [sqlite] R: [sqlite] support for table partitioning?

2006-03-22 Thread Thomas Briggs

   What you've described here is column partitioning - most databases
implement row partitioning, where the rows in the table are split between
multiple, hidden sub-tables based on the value(s) in one or more columns
within the row.  The most common application is separating date-based
data - transparently keeping the data for each month in its own table,
for example.
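
As a rough sketch of what that buys you (all names hypothetical), the
effect can be approximated by hand in SQLite with one table per month
and a UNION ALL view over them:

CREATE TABLE orders_2006_02 (id INTEGER PRIMARY KEY, d DATE, amount REAL);
CREATE TABLE orders_2006_03 (id INTEGER PRIMARY KEY, d DATE, amount REAL);

CREATE VIEW orders AS
  SELECT * FROM orders_2006_02
  UNION ALL
  SELECT * FROM orders_2006_03;

-- A query that names one month's table directly reads only that table;
-- real row partitioning does this routing transparently.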

   And in my ever so humble opinion, this would be a great thing for SQLite to 
have - anything that allows the database to be more selective about what rows 
it reads to satisfy a particular query is a good thing.

   Also, for the record, beware of the word partitioning - it means different
things to different databases.  For Oracle and SQL Server 2005, partitioning
means table partitioning, while for DB2 it means database partitioning (a la
Teradata).

   -Tom

> -Original Message-
> From: Zibetti Paolo [mailto:[EMAIL PROTECTED] 
> Sent: Tuesday, March 21, 2006 9:22 AM
> To: 'sqlite-users@sqlite.org'
> Subject: [sqlite] R: [sqlite] support for table partitioning?
> 
> > The database already knows exactly where to look for each table when
> > all the tables are in the same file.
> > All it has to do is "lseek()" to the appropriate spot.  How does
> > moving tables into separate files help this or make it any faster?
> >
> 
> "Table partitioning" is a technique used to improve 
> performance of large
> databases running on large machines.
> With table partitioning you can configure the DB so that it 
> stores some
> fields of a record in a file and the remaining fields of the 
> same record in
> a different file.
> If each record is large, placing the two files on two different disks
> usually speeds things up because reading (or writing) a 
> record requires to
> read, in parallel, half the record from one disk and the 
> other half from the
> other disk.
> Performance also increases if your select happens to access 
> only the fields
> stored in one of the two files: if the select requires a 
> sequential scan of
> the entire (large) table, the DB manager will have to read 
> through only half
> the mount of data.
> In my opinion however table partitioning is beyond the scope 
> of a DB like
> SQLite...
> 
> Bye
> 
> 
> 
> 
>  -Original Message-
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] 
> Sent: Tuesday, 21 March 2006, 14:48
> To: sqlite-users@sqlite.org
> Subject: Re: [sqlite] support for table partitioning?
> 
> "Miha Vrhovnik" <[EMAIL PROTECTED]> wrote:
> > On 3/21/2006, "[EMAIL PROTECTED]" <[EMAIL PROTECTED]> wrote:
> > 
> > >"Miha Vrhovnik" <[EMAIL PROTECTED]> wrote:
> > >> Hi drh and others,
> > >> 
> > >> Regarding the issues that appear on the ML with very large tables,
> > >> and knowing that sqlite now enforces constraint checks on tables,
> > >> is there any chance of supporting table partitioning?
> > >> 
> > >
> > >Put each table in a separate database file then ATTACH
> > >the databases.
> > >
> > That's not the same. You still need to rewrite queries, whereas with
> > real table partitioning you don't need to do that.
> 
> What changes to the queries do you think are necessary?
> 
> > Selects and inserts are faster because the db knows where to
> > put/search for them.
> > 
> 
> The database already knows exactly where to look for each
> table when all the tables are in the same file.  All it
> has to do is "lseek()" to the appropriate spot.  How does
> moving tables into separate files help this or make it any
> faster?
> 
> --
> D. Richard Hipp   <[EMAIL PROTECTED]>
>