Re: [sqlite] WAL mode is reliable?

2011-11-05 Thread Frank Missel
> boun...@sqlite.org] On Behalf Of Richard Hipp

> (2) WAL really does give better concurrency since it allows other
processes
> to continue reading while a single process is writing to the database.
In
> rollback mode you can have one writer *or* multiple readers.  In WAL mode
> you can have one writer *and* multiple readers.  On a busy system, or in a
> system with long-running transactions, the difference can be significant.
> 
> (3) Transactions commit a lot faster with WAL.  Depending on your app, you
> might see a measurable performance improvement just by switching to WAL.
> 
> (4) WAL mode is much more resistant to corruption following a power
failure
> that occurs soon after the disk drive lied and said that content was
> committed to oxide when in fact it was still in a volatile track buffer.
> And pretty much all consumer-grade disk drives tell such lies these days.

I just recently started to use WAL and had the following experience with
above:

I work with large databases (e.g. 40 Gb) and when loading data to these, I
will usually get my program to make copy of the database through the OS and
then during heavy data loads run the database with: 

PRAGMA journal_mode = OFF;
PRAGMA locking_mode = EXCLUSIVE;
PRAGMA synchronous = 0;
PRAGMA cache_size = 40;

 This increases the load spead tremendously. However, as the database
increases the time spend copying the database increases with it,  although
it is still much faster than using the default journal_mode (DELETE).
Also, the database is locked for even read access during the loads.

Reading about WAL (http://sqlite.org/wal.html) I decided to try it out
during data loads with:

PRAGMA journal_mode = WAL;  
PRAGMA locking_mode = NORMAL;  
PRAGMA synchronous = 1;  
PRAGMA cache_size = 40;

(Thus using the same size cache and now just the default settings for
locking_mode and synchronous).

The result was that the data loads were running at virtually the same speed
as when journal_mode was set to OFF! 

Also, the database can now be opened for reading by other processes and I no
longer have the hassle of waiting for OS copies to be done for every load,
so having had only positive results and no adverse effects I can really
recommend the write ahead logging. 


Best regards,

Frank

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Multiple databases in one DB connection. How to let BEGIN TRANSACTION..COMMIT apply to only one of them?

2011-10-26 Thread Frank Missel
Well, if the statements within your BEGIN TRANSACTION ... COMMIT only really
affects one database (hereafter called the main database) only that database
should be locked.
If you experience otherwise, it could be because:

1. The locking_mode for the databases has been set to EXCLUSIVE (see
http://sqlite.org/pragma.html#pragma_locking_mode)

2.  The statements executed against the main database may read from other
databases as part of the execution thus holding SHARED locks on the other
databases which will prevent writing to them. See an explanation of the
locking system in SQLite here (http://www.sqlite.org/lockingv3.html).
However, you should still be able to read from the other databases while the
transaction is being committed.

3. Inside the transaction against the main database you have overlooked some
statement that does in fact write to the other databases.

Perhaps you can supply some more details about what takes place inside the
transaction against the main database and what you are trying to do with the
other databases, if the above explanation doesn't solve your issue.


/Frank

> -Original Message-
> From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-
> boun...@sqlite.org] On Behalf Of Simon L
> Sent: 26 October 2011 14:08
> To: sqlite-users@sqlite.org
> Subject: [sqlite] Multiple databases in one DB connection. How to let
BEGIN
> TRANSACTION..COMMIT apply to only one of them?
> 
> While  one of the databases is under a lengthy writing process,  I don't
want
> other databases to be locked by the BEGIN TRANSACTION...COMMIT lock.
> Is this possible? Thanks.
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] EXT :Re: Concurrent readonly access to a large database.

2011-10-25 Thread Frank Missel
Hi Bo,

> boun...@sqlite.org] On Behalf Of Bo Peng
> > I wonder if it would be better on just having the data organized
> > before loading it, so that the records in each of the 5000 tables
> > would be contiguously stored. Of course, that also depends on how much
> > new data will be added to the tables.
> 
> I have been thinking about this too, but I do not know how to do it
> efficiently.
>
>3. Create 5000 files, insert records to them, and import the files to the
main database. This seems to be the best option although I need to pay
>attention to OS limit on opened files.
> 
> Any suggest is welcome.

A variation of option 3 could is to load the data to an SQLite In-Memory
database (see http://www.sqlite.org/inmemorydb.html) in chunks so that every
time the in-memory database is full you open the disk files consecutively
one at a time appending the data to them from the corresponding table in the
memory database. Afterwards the in-memory database is cleared (close the
connection or drop the tables) and the cycle starts over.
This way you will not have to worry about the number of open files in the OS
which could give complications as you write.

You could of course code this this yourself holding the same amount of the
original records in a memory structure directly without using SQLite.
However, I could imagine that it would be faster to implement (and less
error prone) just using the SQLite in-memory database.

Once all 5000 files had been fully written they would then be loaded to the
final SQLite database.

> > Is the primary key an integer so that it in fact is the rowid thus
> > saving some space?
> > (see http://sqlite.org/lang_createtable.html, the section: "ROWIDs and
> > the INTEGER PRIMARY KEY".)
> 
> This certainly worth trying, as long as the performance is acceptable.
> It would be good if I can tell sqlite that there is no duplicate and
missing
> values so it does not have to check.

Well, if you don't have any particular use of a primary key its probably
best to not define any so that no processing time is used on it.
Also, if you do define a column as "INTEGER PRIMARY KEY" it becomes an alias
for the rowid for that record and thus will not be a value that you supply.
The rowid in this case will be frozen for a given record and stay the same
even when a vacuum is done on the database.

> > If there are several calculations to be done for each table it would
> > perhaps be better to work with several threads concentrating on a
> > single table at a time. This would be particularly effective if your
> > cache_size was set large enough for a table to be contained entirely
> > in memory as the threads could share the same connection and (as I
> > understand it) memory cache (see
> http://www.sqlite.org/sharedcache.html).
> 
> I am running SELECT COUNT(*), MAX(QT) FROM TABLE_XX; etc as long as the
> WHERE clause is the same. I am also running queries in multiple threads
> which works well on SSD drive, but not on mechanical drive due to the
> bottleneck on random-access speed. I am not sure if we are talking about
the
> same cache but because the tables are spread sparsely across the database,
> caching more pages did not really help.

What I mean here is that if you have a number of different COUNT, MAX or
other statistical functions to be calculated for example with several
different WHERE clauses (e.g. 10, 20 or 30), it will pay off to focus on one
table at a time performing all the queries (whether consecutively or
concurrently using several threads) as the table will be loaded to the
memory cache the first time it is accessed. This is assuming that a table
contains about a million records and thus can be contained entirely in
memory.
This will work regardless of whether you have organized your database so
that records physically are stored contiguously. Of course, it is still best
to have the data stored contiguously as that will give a huge load
improvement.
If there is only a single SELECT COUNT and MAX statement to be done for the
entire 5000 tables it will not make any difference.

> > One might even consider a different strategy:
> > Do not use the statistical function of SQLite (count, average, max,
> > etc.) but access the records individually in your code. This requires
> > that data are loaded according to the primary key and that the threads
> > accessing them do so in a turn based fashion, using the modulo
> > function to decide which thread gets to handle the record, e.g.:
> 
> If all tables have the same set of item IDs, this can be a really good
idea. This
> is unfortunately not the case because each tables have a different set of
IDs,
> despite of 70% to 90% overlapping keys. I even do not know in advance all
> available IDs.

Okay, so each table will have to be handled by itself. 


Best regards,

Frank

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] EXT :Re: Concurrent readonly access to a large database.

2011-10-24 Thread Frank Missel
> boun...@sqlite.org] On Behalf Of Bo Peng
> 
> I will do this multiple times, with different conditions (e.g. SELECT
> MAX(c) FROM TABLE_X WHRE b > 1.0) so maintaining number of rows would
> not help. I intentionally avoided TRIGGERs because of the large amount
> (billions) of data inserted.
> 
> Other than using a SSD to speed up random access, I hope a VACUUM
> operation would copy tables one by one so content of the tables would not
> scatter around the whole database. If this is the case, disk caching
should
> work much better after VACUUM... fingers crossed.

Doing vacuum on a 288 Gb database is probably going to take some time. I
wonder if it would be better on just having the data organized before
loading it, so that the records in each of the 5000 tables would be
contiguously stored. Of course, that also depends on how much new data will
be added to the tables. 

Having worked with large databases and reading through this mail thread,
some questions and ideas come to mind:

How many new tables/records will be added per day/month?

Are records divided amongst the 5000 tables  based on time so that new
records will go into new tables rather than be inserted evenly among the
5000?

How many fields in the tables (I understand the 5000 tables are identical in
structure) 
What type of data is it?

Are there any indexes besides the primary key?
Unless your calculations are always or mostly on the same fields it is
probably best not to have any indexes.

Are there any redundancy in the data, e.g. character values which could be
normalized to separate tables using an integer key reference thus reducing
the size of the data carrying tables. Converting field contents to integer
or bigint wherever it is possible may give improvements in both size and
performance.

Is the primary key an integer so that it in fact is the rowid thus saving
some space?
(see http://sqlite.org/lang_createtable.html, the section: "ROWIDs and the
INTEGER PRIMARY KEY".)

If there are several calculations to be done for each table it would perhaps
be better to work with several threads concentrating on a single table at a
time. This would be particularly effective if your cache_size was set large
enough for a table to be contained entirely in memory as the threads could
share the same connection and (as I understand it) memory cache (see
http://www.sqlite.org/sharedcache.html) . Hereby the CPU and memory can be
put to good use rather than working with several independent connections
which do not utilize CPU and memory efficiently but just push the disk heads
around.

One might even consider a different strategy:
Do not use the statistical function of SQLite (count, average, max, etc.)
but access the records individually in your code. This requires that data
are loaded according to the primary key and that the threads accessing them
do so in a turn based fashion, using the modulo function to decide which
thread gets to handle the record, e.g.:

KeyThread
1001
1022
1033
1044
1051
1062
Etc.

(Sorry if the explanation is too explicit :-).

Thus you can have an army of threads using CPU and memory most efficiently
as the threads will be handling the same pages of the database which will be
cached in memory. Again, this requires that the records are physically
stored according to the key.
There is perhaps some overhead occurred by accessing the individual records
rather than letting SQLite do the count, max, etc. However, if there are
several calculations to be done on the same data, it may prove more
efficient this way. 

You could even consider having less tables so that the threads can work for
longer without having to switch to the next table. 


/Frank Missel

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Limit COUNT

2011-10-18 Thread Frank Missel
> boun...@sqlite.org] On Behalf Of Bart Smissaert
> 
> That seems to be the answer and after some quick testing it looks it makes
it
> more efficient as well!
> 
> On Tue, Oct 18, 2011 at 1:36 PM, Kit <kit.sa...@gmail.com> wrote:
> > 2011/10/16 Fabian <fabianpi...@gmail.com>:
> >> How can you limit a count-query? I tried:
> >> SELECT COUNT(*) FROM table LIMIT 5000
> >
> > SELECT min(COUNT(*),5000) FROM table;
> > --

My first impression was that the count(*) inside the min() would access all
the records anyway (perhaps not all the columns though) and thus in fact
still access more than 5000 records, even if it was hidden by the min
function.

To test this I took a database containing tables with millions of records
and executed the following statements:

1. select count(*) from table;

2. select min(count(*), 5000) from table;

3. select count(*) from (select null from table limit 5000);

Now to really test properly the computer would have to be restarted before
each test in order to avoid buffering of the disks, etc.
However, I could not be bothered in this case and just used three separate
tables with identical structure but a different number of records (between 4
and 5 million). As expected with the difference in execution times between
the three statements it did not matter that much.

1. Around 150 seconds

2. Around 14 seconds

3. Around 50 milliseconds !!!

Thus it seems that although variation two perhaps avoids fully reading the
records, it still have to touch them (or perhaps the primary key) which is
better than a normal full select count(*) on the table. Perhaps somebody
know the inner workings better and can explain the difference of factor 10
between variation 1 and 2.

To actually limit the count statement only variation 3 actually works (which
has been suggested earlier in the thread) and is of course several orders of
magnitude faster.

Somebody had also suggested: 
select count(*) from (select 1 from table limit 5000);

Notice the inner select of 1 instead of null. 
As predicted this is slightly slower at around 60 milliseconds. 

So selecting null is always better in sub queries when only the count or
existence of records are needed and not actual values from any of the
columns.


/Frank Missel

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Changing the data type of an existing table

2011-10-18 Thread Frank Missel
Hi Marshall,

A good suggestion.

The reason I chose to "transform" the schema instead was that there are a
number of databases with similar structure on different sites which has the
same declared data types. Also there will be several more in the future.
This is supported by a framework of some classes that I use to handle
SQLite. The frame work has been updated so only those declared data types
that work with Excel (i.e. Excel recognizes the data as the correct type and
use sum and other numerical functions). Thus future databases would have
only the "correct" declared data types. 

Rather than have some databases which need special handling with views I
prefer to handle the situation once and for all in existing and future
databases and not have to worry about having special views for some tables
in some databases at certain sites.

/Frank

> -Original Message-
> From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-
> boun...@sqlite.org] On Behalf Of Marshall Cline
> Sent: 18 October 2011 22:30
> To: General Discussion of SQLite Database
> Cc: General Discussion of SQLite Database
> Subject: Re: [sqlite] Changing the data type of an existing table
> 
> I think you've already found a good solution to your problem by directly
> modifying the master table, so this is mostly an academic curiosity, but
would
> it be a viable alternative to use a VIEW? In other words, rather than
messing
> with the TABLE that holds the data, could you simply create a VIEW that
> exposes the information in the type you want (via a cast?), then your
excel
> can access the data via the VIEW rather than the TABLE?
> 
> To reiterate, I'm merely asking about viability here.
> 
> Marshall
> 
> Sent from my iPhone
> 

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] FW: How about a proper forum rather than an e-mail list

2011-10-18 Thread Frank Missel
Hi Mark,

I started this thread with a  suggestion of having a forum for SQLite rather
than a mailing list. That is the subject line.
All posts have been related to that. I wished to see what the consent or
lack thereof was in the community. Possibly it would then also be noted by
those that presently run the SQLite mailing list.
So how is that off topic exactly? 
Those that don't have anything to add to it, don't have to bother with it.

If you had started another thread about some technical aspect of SQLite and
I then started mailing to that thread inserting irrelevant posts about a
forum I would agree with you. But that is not quite the case is it. 
Your reply come over as somehow offended and I don't really see why.

Possibly you may feel that these post take up space in the general SQLite
mailing list, but that is one point of having a forum, that one can easily
overview and follow the threads which are of interest -- although I did not
start the thread to demonstrate it in that way :-).

I appreciate your suggestion about starting a forum myself. I do feel,
however, as stated earlier that it is not my business to take over the
running of the SQLite website, mailing list, forum, etc. That would belong
with the organisation, company, persons that take care of that now. By
having this thread with people replies and opinions it can then give an idea
of the pro and cons as well as an idea of how would be for and against it.


Best regards,

Frank


> -Original Message-
> From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-
> boun...@sqlite.org] On Behalf Of Mark Schonewille
> Sent: 18 October 2011 22:19
> To: General Discussion of SQLite Database
> Subject: Re: [sqlite] FW: How about a proper forum rather than an e-mail
list
> 
> Hi Frank,
> 
> If you want to discuss the pros and cons of a forum, I think you should
ask
> your question on a mailing list or forum about mailing lists and forums.
Your
> question is very off-topic if you don't intend to start a SQLite forum
yourself,
> but I think you could ask the members of this SQLite mailing list "who
wants
> to start a forum?" without starting a lengthy off-topic discussion. I'm
out of
> this discussion now.
> 
> --
> Best regards,
> 
> Mark Schonewille
> 

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] FW: How about a proper forum rather than an e-mail list

2011-10-18 Thread Frank Missel
Hi Mark,

Well, perhaps you are right. 

But I am not going to take on that. 
I am a user of SQLite and can perhaps help with some posts and suggestions
now and then.
So I am just going to throw in the idea and state the pros vs. cons as I see
them .

Best regards,

Frank


> -Original Message-
> From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-
> boun...@sqlite.org] On Behalf Of Mark Schonewille
> Sent: 18 October 2011 21:54
> To: General Discussion of SQLite Database
> Subject: Re: [sqlite] FW: How about a proper forum rather than an e-mail
list
> 
> Hi,
> 
> I know from experience that there really is no point whatsoever in
discussing
> whether a forum is better than mail or not. People are just not going to
> agree. Just set up your forum and find out whether you're going to have
> subscribers.
> 
> --
> Best regards,
> 
> Mark Schonewille
> 
> Economy-x-Talk Consulting and Software Engineering
> Homepage: http://economy-x-talk.com
> Twitter: http://twitter.com/xtalkprogrammer
> KvK: 50277553
> 
> See what you get with only a small contribution. All our LiveCode
downloads
> are listed at http://qery.us/zr
> 
> On 18 okt 2011, at 15:49, Frank Missel wrote:
> 
> >> boun...@sqlite.org] On Behalf Of Jean-Denis MUYS
> >>
> >>> 1. Several subject forums as mentioned
> >>>
> >> Mail can have as many subjects as desired
> >
> > Well, so can forum posts. The point is that the forum is divided into
> > main categories above the subject of the post
> >
> >
> >>> 2. Better view of threads with several levels being immediately
> >>> displayed
> >>>
> >> My mail client threads far better than most forums (fora?)
> >
> > I seriously doubt that :-).
> > I mean how much better than a totally ordered hierarchy with several
> > levels and forks can it be.
> >
> > But perhaps your mail client is very good. Which one do you use?
> >
> > Also, in a forum, you can see posts from a selected author sorted by
date.
> >
> >
> >>> 3. Preview of entries and  editing of them even after they are
> >>> posted (by the author)
> >>
> >> Which can be seen as a liability
> >
> > To be sure.
> > But then again until anyone has posted a reply. It can be used to
> > correct a typo or add info (where it makes sense rather than add a new
> post).
> >
> >
> >>> 4. Formatted rather than plain text
> >>
> >> Which mail is capable of
> >
> > Any formatting is stripped. At least it is so on the sqlite-users
mail-list.
> >
> >
> >>> 5. No need for e-mail-addresses to be exposed
> >>>
> >> Couldn't a mailing list hide email addresses too?
> >
> > Perhaps it could.
> > I find, however, that e-mails are spread all over the place on the
> > sqlite-users mailing list. So it is not being done here.
> > That is one of my big objections as it invariable leads to spamming of
> > the members.
> >
> >
> >> On the other hand, I find mailing lists much better: I can read them
> >> off
> > line, I
> >> can also answer them off line (my client will send my answers as soon
> >> as
> > it
> >> gets online), I can archive any and/or all posts that I find valuable.
> >
> > I get messages from others when they are on their way to work or home
> > sitting in trains and busses.
> > Mostly people are just online through wireless or mobile networks. It
> > is considered "in" to be able to access the Internet at all times in
> > all places, so I find that point a bit moot nowadays.
> >
> >
> >> In fact, when fora (forums?) propose a mailing list interface (i.e.
> >> google groups), I prefer subscribing to them as a mailing list.
> >>
> >> Even better than mailing lists: newsgroups. Except my company only
> >> lets us use port 80 in addition to the mail gateway, so I can't use
> newsgroups.
> >
> > Who am I tell someone what to prefer.
> > I just find, however, it is a bit religious like choice of OS or
> > gadgets. I can see that e-mail lists can work and in the past could
> > have a lot of advantages. Nowadays, I find not as many good reasons to
> > prefer them over a proper forum which have some really nice features.
> >
> >
> > /Frank Missel
> 
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] FW: How about a proper forum rather than an e-mail list

2011-10-18 Thread Frank Missel
> boun...@sqlite.org] On Behalf Of Eugene N
> Mailing list is the one true way; Look what IT forums turn into
eventually...

This is a generality and not really an argument.
Which forums do you mean?

I have followed several forums which work exactly as one would hope for. Two
examples comes to mind: 
https://forums.virtualbox.org/
http://forums.codeblocks.org/


> The only way to maintain high level of responsibility and seriousness of
> discussion is, alas, a mailing list

I don't see why. The two examples above are very good forums with relevant
posts, etc.
I find that SQLite would fall in the same category: A technical expertise
forum covering a specific product, application, programming language, etc.
With a members only access and very dedicated users.


/Frank Missel

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] FW: How about a proper forum rather than an e-mail list

2011-10-18 Thread Frank Missel
> boun...@sqlite.org] On Behalf Of Jean-Denis MUYS
> 
> > 1. Several subject forums as mentioned
> >
> Mail can have as many subjects as desired

Well, so can forum posts. The point is that the forum is divided into main
categories above the subject of the post 

 
> > 2. Better view of threads with several levels being immediately
> > displayed
> >
> My mail client threads far better than most forums (fora?)

I seriously doubt that :-).
I mean how much better than a totally ordered hierarchy with several levels
and forks can it be. 

But perhaps your mail client is very good. Which one do you use?

Also, in a forum, you can see posts from a selected author sorted by date.


> > 3. Preview of entries and  editing of them even after they are posted
> > (by the author)
> 
> Which can be seen as a liability

To be sure. 
But then again until anyone has posted a reply. It can be used to correct a
typo or add info (where it makes sense rather than add a new post).


> > 4. Formatted rather than plain text
> 
> Which mail is capable of

Any formatting is stripped. At least it is so on the sqlite-users mail-list.


> > 5. No need for e-mail-addresses to be exposed
> >
> Couldn't a mailing list hide email addresses too?

Perhaps it could. 
I find, however, that e-mails are spread all over the place on the
sqlite-users mailing list. So it is not being done here.
That is one of my big objections as it invariable leads to spamming of the
members.


> On the other hand, I find mailing lists much better: I can read them off
line, I
> can also answer them off line (my client will send my answers as soon as
it
> gets online), I can archive any and/or all posts that I find valuable.

I get messages from others when they are on their way to work or home
sitting in trains and busses.
Mostly people are just online through wireless or mobile networks. It is
considered "in" to be able to access the Internet at all times in all
places, so I find that point a bit moot nowadays.


> In fact, when fora (forums?) propose a mailing list interface (i.e. google
> groups), I prefer subscribing to them as a mailing list.
> 
> Even better than mailing lists: newsgroups. Except my company only lets us
> use port 80 in addition to the mail gateway, so I can't use newsgroups.

Who am I tell someone what to prefer.  
I just find, however, it is a bit religious like choice of OS or gadgets. I
can see that e-mail lists can work and in the past could have a lot of
advantages. Nowadays, I find not as many good reasons to prefer them over a
proper forum which have some really nice features.


/Frank Missel

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] How about a proper forum rather than an e-mail list

2011-10-18 Thread Frank Missel
> boun...@sqlite.org] On Behalf Of Igor Tandetnik
> Sent: 18 October 2011 20:05
> For what it's worth, I'm using GMane (http://gmane.org/), which is a
mailing
> list-to-NNTP gateway and happens to carry this list. I'm old-fashioned
enough
> to believe that an NNTP newsgroup *is* the proper forum. Can't stand
> modern Web-based forum interfaces.

I agree.
If not for http://www.mail-archive.com/sqlite-users@sqlite.org/

It would be quite bad. 
Using that gives the list at least a flavour of forum with some overview of
the threads.

But I can see where this is leading :-).  Many seem quite happy with the
list.


/Frank

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] How about a proper forum rather than an e-mail list

2011-10-18 Thread Frank Missel
Hi Teg,

> I love forums and consider them far superior to email if only because it's
> easier to follow a topic with less quoting needed. The downside is that
> someone  has  to  manage the forum. I've managed a forum for the past  10
> years  and there's a daily spam cleanup process and constant attacks and
> required upgrades.  You have to set the tone and be pretty ruthless about
> flaming too.
> 
> I'd like to see a forum. I just wouldn't want to manage it.

Okay, but if the posting is by members only would it not be the same as the
e-mail-lists. 
I don't see a lot of spam in the e-mail-list, so either it also monitored by
someone or the fact that it can only be accessed by members makes for a well
behaved list which would be the same for the forum.

/Frank

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] How about a proper forum rather than an e-mail list

2011-10-18 Thread Frank Missel
I think that the sqlite-users e-mail list has enough traffic to warrant a
proper forum. 

Has this been considered?

 

A proper forum also can contain several subject forums, e.g.

USER FORUMS:

Announcements

General

Help with SQL

News

DEVELOPERS CORNER:

News

General

OS specific

. 

 

There is a quite widespread, free Forum software that could be used:

http://www.simplemachines.org/

http://en.wikipedia.org/wiki/Simple_Machines_Forum

 

They support MySQL, SQLite and PostgreSQL as the underlying database.

 

Well, just a thought.

 

 

/Frank Missel

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] FW: How about a proper forum rather than an e-mail list

2011-10-18 Thread Frank Missel
> [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Mark Schonewille
> I prefer mailing lists because I can read and write e-mails while I'm
online
> (.e.g while travelling) and I can easily flag important messages, copy
text and
> syntax in plain format or print (part of) an e-mail and I can also
organise the
> message in the wy I see fit. A true geek uses e-mail, no forums.

Well, if you are online you can open a browser as well.

You can copy text and print -- it is after all also just text in the posts
in the forum when you use ctrl-c.
When working posts in forum software there are very nice features for
commenting posts of others where author and date is automatically inserted.

The one thing that I find really nice in forums is the immediate overview of
longer threads with forks on several levels. Its easy to follow the
discussions.

I have tried both forum and e-mail-list. Although e-mail-list can work, the
forum is so much nicer in my opinion.

> A true geek uses e-mail, no forums.

Ah well, perhaps this is more the reason for not wanting to try it ;-).


> However, if you really want a forum, install one on your web server and
find
> out how many people use it.

Nah, for it to have any success it would have to be an official forum
endorsed and supported by the SQLite team.


/Frank Missel

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] FW: How about a proper forum rather than an e-mail list

2011-10-18 Thread Frank Missel
I see several advantages to having a forum:

1. Several subject forums as mentioned

2. Better view of threads with several levels being immediately displayed

3. Preview of entries and  editing of them even after they are posted (by
the author)

4. Formatted rather than plain text

5. No need for e-mail-addresses to be exposed


/Frank Missel

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] FW: How about a proper forum rather than an e-mail list

2011-10-18 Thread Frank Missel
I think that the sqlite-users e-mail list has enough traffic to warrant a
proper forum. 

Has this been considered?

 

A proper forum also can contain several subject forums, e.g.

USER FORUMS:

Announcements

General

Help with SQL

News

DEVELOPERS CORNER:

News

General

OS specific

. 

 

There is a quite widespread, free Forum software that could be used:

http://www.simplemachines.org/

http://en.wikipedia.org/wiki/Simple_Machines_Forum

 

They support MySQL, SQLite and PostgreSQL as the underlying database.

 

Well, just a thought.

 

 

/Frank Missel

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Limit COUNT

2011-10-17 Thread Frank Missel
> boun...@sqlite.org] On Behalf Of Fabian
> Sent: 17 October 2011 15:34
> >
> No, I only want to have a capped total available.
> 
> If I would go with Simons solution, I have to read the rows for the first
> 100 pages (or whatever the cap is) into a temporary table, just to show
the
> first page. I don't need a cache for all those other pages, so that seems
a lot
> of overhead. I only want to know if there are 100 or less pages (without
> copying data around).
> 
> Maybe COUNT() is also creating a temporary table behind the scenes, then
> the performance of Simons solutions would be comparable with what I have
> now, and I would have the advantage that I can re-use that table to show
> subsequent pages without reading from disk.
> 
> But I always assumed COUNT() was faster than copying between tables,
> maybe I should just benchmark it.

I had the idea that you just retrieved the first 100 records and not the
first 100 pages.
Could the user not just see the first 100 records and perhaps an indicator
if there were more or not. He could then get 100 records at a time browsing
through them or if he so wished get a record count (you would then use the
count function on all records).

If you need to know up front whether there are more than 5000 records or not
I suggest you use the suggestion from Petite Abeille:

select count( * )
from   (
 select 1   or even just "select null" which
will not fetch anything.
 fromtable
 limit 5000
   )

You could do that and also just select e.g. 50 actual rows and display these
to the user.


/Frank


 

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Limit COUNT

2011-10-16 Thread Frank Missel
> boun...@sqlite.org] On Behalf Of Simon Slavin
> Sent: 16 October 2011 21:53
> 
> Perhaps he could read the rows LIMIT 100 into a buffer (after all, he's
going
> to need them eventually for when he displays them), then count how many
> rows he got.

Yeah, I would go that way also.

But it sounds a bit like Fabian both wants to have the total number of
records available and at the same time limit the count.
It is just not possible to do both at the same time as far as I can see.

Thus, you are left with following options as far as I can see:

1. As Slavin mentions: Read a certain number of records, e.g. 101 and change
your code so that the user can ask for e.g. 100 additional records at a
time. Possibly you could have an extra function that the user could activate
to have a total count if he so wishes. This would then read all records and
you would not have to worry about performance as all records (unless there
are very many) will be in the memory cache and subsequent request for these
records should be fast.

2. If there are many concurrent users and the query requests from these are
quite predictable, e.g. straight reads of all records in certain tables, you
could perhaps  have a local background job regularly count the number of
records for the relevant tables/queries and store these count results  in a
special status table. Your client program could then read the number of
records from the special status table as well as the first 101 records from
the actual data carrying table. The user could then get the first 100
records, and continue with batches of 100 as needed.

If these options does not seem to fit, I think that you have to explain a
bit more about how your solution is,  what you expect and how you access the
database.


/Frank


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Limit COUNT

2011-10-16 Thread Frank Missel
Hi Fabian,

The problem is that the limit apparently is applied on the result set, and
in this case the result set is only one row which is less than the 5000.
The culprit is the count(*) which basically says that to get the first row
in the result set all rows from the table has to be processed.

You could instead try:

select count(*)
from (select * from table limit 500)
;

That would give you the number 500. But it is kind of meaningless.
You might as well write:

select *
from table 
limit 500
;

And get the actual 500 first rows. In this case the result set is build up
as the select statement processing loops through the records in the table,
and thus it only needs to take 500 rows. Again, if you add a count(*) it
needs to traverse all records to get the first and only row  of the result
set that states the number of records in the table. It is the same if you
specify ORDER BY together with limit. You will only get the x number of
records in the LIMIT statement but before that all records will have to be
traversed and sorted before you the 500 rows, unless perhaps there already
is an index with one component fitting the ORDER BY perfectly.

What do you want to attain with the count?

/Frank

> -Original Message-
> From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-
> boun...@sqlite.org] On Behalf Of Fabian
> Sent: 16 October 2011 19:09
> To: General Discussion of SQLite Database
> Subject: [sqlite] Limit COUNT
> 
> How can you limit a count-query? I tried:
> 
> SELECT COUNT(*) FROM table LIMIT 5000
> 
> But it ignores the LIMIT clause. I think the workaround would be counting
the
> results of a sub-query, but I'm trying to understand whats wrong with the
> syntax above. The goal is to make the engine stop iterating after it
reached
> the upper limit, in order to save performance.
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Changing the data type of an existing table

2011-10-16 Thread Frank Missel
Hi Dan,

> The two values in table "t" are stored in integer form. Were you to
magically
> change the database schema without rebuilding the underlying
> b-trees:
> 
>CREATE TABLE t(a TEXT);
> 
> and then execute the same SELECT, it would not work. SQLite would search
> the index for text value '1', not integer value 1 (since it assumes that
all
> values had the TEXT affinity applied to them when they were inserted). And
> the query would return no rows.

You are right. However, in my case it is a bit special in that the values in
the fields actually already are stored correctly as REAL numbers but because
the declared data type is DECIMAL (which is used in some databases), Excel
will not treat the real numbers as such. Changing the declared data type
does the trick!

As for text to integer as in your example, I guess it could also be done.
First the declared data type could be changed as I have described. After
that an update statement could set the  = CAST( as INTEGER).
But you are right that indexes in this case would have to be rebuilt with
the REINDEX statement. However, that might still be better than having to
recreate the whole table and build indexes.


/Frank

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Changing the data type of an existing table

2011-10-16 Thread Frank Missel
> -Original Message-
> From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-
> boun...@sqlite.org] On Behalf Of Frank Missel
> Sent: 16 October 2011 15:37
> 
> With this open philosophy of the SQLite database I thought that perhaps
> there really isn't any compelling reason to disallow changing the declared
> data type of an existing column since the actual stored data of a column
> might be of any type - again, according to the basic philosophy of SQLite.
> Thus changing a declared data type of an existing column should not break
> anything between the schema and the stored data. But I am not sure if it
is
> even possible. If not, I will move my many Gigabytes of data around, but I
> thought, it would be worth just checking first.

Well, I looked a bit more into it and found that indeed the declared data
type of a column in an existing table can be changed without breaking
anything. In the example below the data type "DECIMAL" is changed to "REAL"
(which will make the real numbers transferred to Excel via the ODBC driver
behave as such):

1. Backup the database

2. SQL statement: 
pragma writable_schema = 1;

3. SQL statement: 
update sqlite_master
set sql = replace(sql, 'DECIMAL', 'REAL   ')
where type = 'table'
and sql like '%decimal%'
;

4. 2. SQL statement: 
pragma writable_schema = 0;

Notice the three spaces after REAL. They are not really needed but in case
there are extra field attributes after the data type in the CREATE TABLE
statement and the attributes are column aligned, the alignment will be
retained after the change.

I guess other aspects of the tables can also be changed, e.g. renaming
columns -- but here one has to remember any indexes, constraints, etc. So
that is a bit more daring but probably doable. For dropping columns one
needs to copy the data to a new table as far as I can see.

I take a bow: Hurrah for SQLite. This is truly magnificent!

WARNING:
Be sure make a backup copy of the database before doing anything as changing
the schema can corrupt it.
Any tinkering with the sqlite_master table is done at your own risk. Don't
blame me if anything goes wrong :-).

Also, I tried out the above on a small test database of the same structure
as a production database and after changing contents of sqlite_master I
tested various statements against the database to see that everything worked
as it should.

I will wait a while before changing my production databases, just in case
someone might see anything wrong with the above.


/Frank Missel


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Changing the data type of an existing table

2011-10-16 Thread Frank Missel
Unfortunately, a need has arisen to change the data type of columns in
existing tables in some quite large databases. This is due to the fact that
Excel does not take kindly to data received from the SQLite ODBC driver
unless they are of a certain data type (the declared name), i.e. the data
are not perceived as the right type and so e.g. numerical summary functions
will not work.

 

The issue is covered by this thread:

http://www.mail-archive.com/sqlite-users@sqlite.org/msg64714.html

 

This is the reason why I would want to change the data type of existing
columns. Otherwise, I guess it would not matter that much as the types of
SQLite are bound to the actual data values and not the columns in the schema
which only have a data type affinity. 

 

Instead of changing the data type I could of course just enclose all columns
in CAST statements when selecting data to Excel. However, I prefer to only
have to do that when including expressions. Also, I have updated the frame
work that I use to handle SQLite databases so that any future tables will
only declare data types that will work correctly towards Excel. Thus I now
wish to make sure that all existing databases conform with these data types.

 

I was just about to write some code to browse through a data base, find all
tables with columns declared as a certain data type  to be changed and then
move the data over to a new identical table (but defined with the new data
type for the relevant columns). However, before going through this exercise
I thought I would just ask if any one knew of a better way to do this, as
there are several quite large databases to be so handled.

 

When I first learned about SQLite I never like the relaxed handling of data
where column data types from the schema are not enforced strictly on the
actual data in the tables. This felt instinctively wrong as it is very
contrary to other databases and many programming languages that I have
worked with. However, I have come to appreciate SQLite as a powerful data
container that you can use to handle data in various ways according to need.
With e.g. check restraints one can enforce data types on columns at a basic
level. On the other hand, it is sometimes more useful to just let data enter
the database and then later on use data validation to the extent that it is
needed.

 

With this open philosophy of the SQLite database I thought that perhaps
there really isn't any compelling reason to disallow changing the declared
data type of an existing column since the actual stored data of a column
might be of any type - again, according to the basic philosophy of SQLite.
Thus changing a declared data type of an existing column should not break
anything between the schema and the stored data. But I am not sure if it is
even possible. If not, I will move my many Gigabytes of data around, but I
thought, it would be worth just checking first.

 

 

/Frank Missel

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] How to use SQLite as a data source in Excel (tables and pivottables)

2011-10-15 Thread Frank Missel
Hi Alek,

I did get it to work without the hassle of creating DSNs for every database but 
instead using the generic DSN for SQLite by recording a VBA macro and then 
substituting the name of DSN reference from the one mapped to a particular 
database to the name  "SQLite3 Datasource" system DSN (from the ODBC driver) 
and changing the SQL code so it matches the actual database.
However, it calls the Connections.Add method with arrays and thus doesn't seem 
as simple as your code which has a better structure and thus will be easier to 
adapt, so I will use that as the basis for my external code calling the Excel 
COM objects.

Thanks a lot, this is very helpful.


Best regards,

Frank

> -Original Message-
> From: Alek Paunov [mailto:a...@declera.com]
> Sent: 15 October 2011 00:13
> To: General Discussion of SQLite Database
> Cc: Frank Missel
> Subject: Re: [sqlite] How to use SQLite as a data source in Excel (tables and
> pivottables)
> 
> Hi Frank,
> 
> You can take a look at my start-up file for excel 2000/2003:
> 
> http://source.declera.com/excel/personal.xls
> (I am also attaching contained VBA module db.bas)
> 
> With started personal.xls [1], one can open empty sheet named "sqlite",
> enter the path to sqlite database file in cell A1, and then use the following
> (defined in personal.xls) shortcuts:
> 
> * Ctrl-D: pressed (for example) in A6: Executes SQL in B6 storing results in
> new worksheet named as A6
> 
> * Ctrl-Shift-D: in A3, same as above, but looks for SQL in A4 and stores the
> result starting from A5
> 
> * Ctrl-T: like Ctrl-D but opens Pivot table instead of Query Table
> 
> You can see these in following example:
> http://source.declera.com/excel/packages.xls
> 
> The example workbook uses this database (part of the Fedora packaging
> system yum):
> http://source.declera.com/excel/packages.zip
> 
> As you will see (in the VBA code), this VBA glue lies on the ODBC driver for
> sqlite (Thank you Mr.Werner!), but can be used with OLEDB sources too.
> 
> Cheers,
> Alek
> 
> [1] Excel personal.xls from the location specified in:
>  Tools/Options/General/At startup, open all files in
> 
> 
> On 13.10.2011 22:10, Frank Missel wrote:
> > Hi Bart,
> >
> >> boun...@sqlite.org] On Behalf Of Bart Smissaert No, the wrapper is
> >> not used that way and I don't think it can be used that way.
> >> The SQLite database is dealt with in VBA or VB6 code via this wrapper.
> >> I suppose you could compare it to using ADO with a DSN-less connection.
> >>
> >> RBS
> >
> > Okay, that's what I thought.
> > I did try the wrapper a couple of years ago and found it very well
> > designed and performing; I can also recommend it for VB 6, VBA or
> VBScript.
> > I now use the C API directly -- sort of my own wrapper for some
> > special purposes.
> >
> > So when referencing an SQLite database from Excel you also use the
> > ODBC driver I guess.
> > This brings me to my main remaining issue which is to avoid having to
> > create an individual data source for each SQLite database.
> >
> > If anyone have any solution for this or any other, easier alternative
> > way of accessing an SQLite database as a data source programmatically
> > through the Excel COM object model (in order to e.g. create a
> > Pivottable), I would be very eager to hear about it :-).
> >
> >
> > /Frank
> >
> > ___
> > sqlite-users mailing list
> > sqlite-users@sqlite.org
> > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
> >


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] How to use SQLite as a data source in Excel (tables and pivottables)

2011-10-13 Thread Frank Missel
> -Original Message-
> From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-
> boun...@sqlite.org] On Behalf Of Petite Abeille
> Sent: 14 October 2011 03:58
> To: General Discussion of SQLite Database
> Subject: Re: [sqlite] How to use SQLite as a data source in Excel (tables
and
> pivottables)
> 
> Pivot tables can be populated from another worksheet, a data source, an
> OLAP cube, etc...

Yes, indeed.

> That number is pretty high these days, but yes there is no point using an
> intermediary worksheet. Get the data directly from source. Also, you may
> want to pre-process, e.g. summarize, your data in the database already as
> much as you can before hand. Excel is not a speed daemon when confronted
> with a truck load of data.
> 
> You can also create and save the pivot table as an offline OLAP cube, with
> hierarchical dimensions & all (Microsoft Query + OLAP Cube Wizard).

Row max is slightly more than a million rows now which is a lot better than
in Excel 2003. 
However, the data sets that I use often go above that number of rows.

Pre-processing is an option that I sometimes use. But  it then freezes the
representation of the data in that form. The ability to change the view of
the data in Pivottables is a very nice feature.

I find that Excel has decent performance in Pivottable calculations,
especially so with the newer versions.

I might look into OLAP cubes, although I find it adds yet another
computational layer. 
Probably the best option for now is to stick with the ODBC driver and just
accept the hassle of creating a new DSN for every individual database which
requires some registry manipulation which is a bit messy but can be done.

I am still dreaming and hoping  that someone might have the perfect solution
for just using the SQLite database as a proper OLE DB data source which
seems to be the way to do it nowadays or using the generic ODBC driver as
the data source without having the create an individual DSN.


/Frank

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] How to use SQLite as a data source in Excel (tables and pivottables)

2011-10-13 Thread Frank Missel
> boun...@sqlite.org] On Behalf Of Bart Smissaert
> Sent: 14 October 2011 04:05
> To: General Discussion of SQLite Database
> Subject: Re: [sqlite] How to use SQLite as a data source in Excel (tables
and
> pivottables)
> 
> It looks you can't make a pivot table directly from an array.
> What you could do though is write the array to a text file and base the
array
> on that file as an external data source via a text driver.
> Another option is build your pivot table in code, not using the Excel
pivot
> table object.
> 
> RBS

I do sometimes use the option of building the Pivottable through usage of
the SELECT  and GROUP BY. However, it is then frozen in that form. The nice
thing about Pivottables is that the row and column fields can be changed on
the spur to get a new view of the data.

As for text files I find that they introduce yet another layer. The data is
already coming from somewhere else then stored in SQLite, and now they then
have to go to a text file to then be imported to Excel. Also, this may
create new challenges with the data types being recognized correctly, and
more importantly: I could not find a way to programmatically get Excel to
take a text file as basis of a Pivottable. Sure, doing it manually is no
problem at all but back with Excel 2003, I and some others tried to get it
done through Automation (Excels COM object model) -- it simple could not be
done (I almost suspect this was so by design from MS). Perhaps it is
different in Excel 2007 / 2010, but I could imagine not.

All in all it would be nice to just use the SQLite database as a proper data
source like you can with Oracle, SQL server and a number of other databases
/ data sources. But I guess the best for now will be to just accept creating
a DSN for each individual database and use the ODBC driver. Its a bit messy
programmatically as you have to access the Registry but it can be done.

/Frank

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] How to use SQLite as a data source in Excel (tables and pivottables)

2011-10-13 Thread Frank Missel
Hi Bart,

> Interesting. Why, if you had a well performing VB wrapper, did you go this
> route?

1. Implementing the wrapper in the project code would also take some coding,
and I found that using the C API would not be that much extra work. Thus I
could save a layer, which was good as the project had several other layers
already.

2. The project required heavy data loads. I thought that I could get better
performance and control of data validation, i.e. I could decide exactly how
much and what to have.

3. It was a bit fascinating to get to work close to engine -- minimalistic
and effective is always fascinating :-).

> >  If anyone have any solution for this or any other, easier alternative
> > way of
> accessing an SQLite database as a data source programmatically
> 
> This is exactly what I do and no problem at all for example to produce a
pivot
> table based on data from SQLite.

Interesting, how do you get the data from the table or view into Excel to be
the basis of the Pivottable?
Do you paste it to a worksheet (perhaps as arrays) that then becomes the
basis of the Pivottable?

My problem is that the data basis of the Pivottable will sometimes be
millions of rows, i.e. many more than can be contained in a worksheet. 
But when referencing the data source directly as a proper data source the
number of rows are not limited to the maximum number of allowed rows in a
worksheet.

/Frank

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] How to use SQLite as a data source in Excel (tables and pivottables)

2011-10-13 Thread Frank Missel
Hi Bart,

> boun...@sqlite.org] On Behalf Of Bart Smissaert
> No, the wrapper is not used that way and I don't think it can be used that
> way.
> The SQLite database is dealt with in VBA or VB6 code via this wrapper.
> I suppose you could compare it to using ADO with a DSN-less connection.
> 
> RBS

Okay, that's what I thought. 
I did try the wrapper a couple of years ago and found it very well designed
and performing; I can also recommend it for VB 6, VBA or VBScript.
I now use the C API directly -- sort of my own wrapper for some special
purposes.

So when referencing an SQLite database from Excel you also use the ODBC
driver I guess.
This brings me to my main remaining issue which is to avoid having to create
an individual data source for each SQLite database.

If anyone have any solution for this or any other, easier alternative way of
accessing an SQLite database as a data source programmatically through the
Excel COM object model (in order to e.g. create a Pivottable), I would be
very eager to hear about it :-).
 

/Frank

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] How to use SQLite as a data source in Excel (tables and pivottables)

2011-10-13 Thread Frank Missel
Hi Bart,

Okay, but I thought that the wrapper was just for working with the SQLite
database and then later when you wanted to use the database as a data source
that you would then still use the ODBC driver.

But are you saying that you are use the wrapper itself as a data source in
Excel? 
And if so, how do you specify the wrapper as a data source?


/Frank


> -Original Message-
> From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-
> boun...@sqlite.org] On Behalf Of Bart Smissaert
> Sent: 14 October 2011 02:21
> To: General Discussion of SQLite Database
> Subject: Re: [sqlite] How to use SQLite as a data source in Excel (tables
and
> pivottables)
> 
> Hi Frank,
> 
> This VB wrapper is not an ODBC driver, so there is no DSN.
> The database file is set in the connection string:
> 
> Function OpenDB([FileName As String],
>[EncrKey As String],
>[EnableVBFunctions As Boolean = True]) As
Boolean
> 
> Member of vbRichClient4.cConnection
> 
> 
> RBS
> 
> 
> On Thu, Oct 13, 2011 at 7:11 PM, Frank Missel <i...@missel.sg> wrote:
> > Bart, thanks for the offer, but we found the cause of the problems in
> > the data type naming of the columns.
> >
> > Have you found a way to avoid having to define DSN's for each
> > individual database?
> >
> > /Frank
> >

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] How to use SQLite as a data source in Excel (tables and pivottables)

2011-10-13 Thread Frank Missel
Bart, thanks for the offer, but we found the cause of the problems in the
data type naming of the columns.

Have you found a way to avoid having to define DSN's for each individual
database?

/Frank

> -Original Message-
> From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-
> boun...@sqlite.org] On Behalf Of Bart Smissaert
> Sent: 14 October 2011 01:35
> To: General Discussion of SQLite Database
> Subject: Re: [sqlite] How to use SQLite as a data source in Excel (tables
and
> pivottables)
> 
> I use SQLite extensively as a data source in Excel and have never come
across
> this problem.
> Is use Olaf Schmidt's VB wrapper vbRichClient4 and vb_cairo_sqlite.
> If you can send me a workbook that clearly demonstrates the problem then I
> can see if I can deal with it with the above wrapper. I am sure there will
be no
> problem at all.
> 
> RBS
> 

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] How to use SQLite as a data source in Excel (tables and pivottables)

2011-10-13 Thread Frank Missel
> -Original Message-
> From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-
> boun...@sqlite.org] On Behalf Of Simon Slavin
> 
> would something like
> 
> SELECT CAST (theValue AS REAL) AS thisValueReal FROM ...
> 
> force the driver to recognise that the value it was getting was REAL ?
> 
> Simon.

I did not have high hopes, but it did work!

That led me to wonder why that could be. I found out that it has to do with
the declared type name of the column.
"DECIMAL" is no good. However, if you declare a column as "REAL" it works
without any CAST function.
If you column is an expression, however, you still have to use the CAST
function. But at least it can be brought to work.

Worth mentioning is that the cell format is still "General" but it now
really works as a decimal (i.e. you can sum on it).

The above is surprising as Christian Werner writes the following in the
documentation:

" Since October 14th, 2001, the driver supports the data types SQL_INTEGER,
SQL_TINYINT, SQL_SMALLINT, SQL_FLOAT, SQL_DOUBLE, SQL_DATE, SQL_TIME,
SQL_TIMESTAMP, and SQL_VARCHAR."

However, for it to work in Excel, you have to define columns as one of the
following:
INTEGER
REAL
DATE
VARCHAR

I'll let him know.

--

Still, if anyone knows how to use the ADO.NET driver that could also be
interesting, since the ODBC driver is a bit bothersome in that you
apparently have to define a Data Source Name for each individual database
that you want to access -- I haven't found any way to work around that. If
one chooses just the "SQLite3 Datasource" as data source there is an error
message to the effect that the data source contains no visible tables. For
other data sources a particular database can be specified as part of the
selection process.


/Frank Missel

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] How to use SQLite as a data source in Excel (tables and pivottables)

2011-10-13 Thread Frank Missel
I am trying to use data from an SQLite database as a data source for a
Pivottable in an Excel sheet.

 

By use of Micrsoft Query and the SQLite ODBC driver I can reference a table
from an SQLite database either as a list in a worksheet or as basis for a
Pivottable. This apparently works fine. However, there is an issue with the
decimal data type which is not recognized, i.e. the cells are of the type
"General" whereas the integer and date fields are represented with the
correct cell format and function. Even if one does change the format of the
cells containing data from a decimal field to "Number", they still do not
work properly as numbers, i.e. the sum function does not work correctly.

 

I have emailed with the author of the SQLite ODBC driver, Christian Werner,
about the problem. He writes:

 

"The problem is the typelessness of SQLite. In order to obtain column
information early, a SELECT is prepared twice. The first gives the column
names and potential type information.

In the second phase the second select retrieves data. For computed columns,
SQLite usually reports SQLITE_TEXT or even SQLITE_NULL in
sqlite3_column_type. As long as an ODBC application retrieves in advance the
correct typed values of a rowset, everything is fine. But that seems not to
be the case for Excel/Query."

 

I have also installed the System.SQlite.Data ADO.NET driver in the hope that
perhaps it could be used as an OLE DB data source or other type of data
source that could be chosen from Excel. However, it does not show in the
various list of data sources so perhaps that is not possible.

 

Does anyone have experience and/or ideas about how to use SQLite as a proper
data source that can be accessed from Excel besides the ODBC driver which
has the mentioned problems with decimal fields?

 

 

/Frank Missel

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Speed of sorting - diff between integer and string

2011-10-11 Thread Frank Missel
> -Original Message-
> From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-
> boun...@sqlite.org] On Behalf Of Shorty
> Sent: 12 October 2011 09:31
> 
> -- HERE IS MY QUESTION: --
> Is faster for the sqlite database to have the grocery_type as a string or
> integer?
> Or is the speed difference so small it doesn't matter?
> 
> Like instead of having "fruit" in the grocery_type column, I could have it
as a
> "1" type, vegetables would be "2", and then translate that just before
> spitting out the table to my web page.
> 

I think that the sorting itself would be faster with an integer rather than
string.
However, with the small number of items that you mention here the difference
would probably be immeasurably small. Also, if you were to use an integer it
would, as you mention, have to later be converted to a string which also
takes time.

Generally, I go with the ease of use (i.e. use a meaningful string rather
than an integer) unless it is a database where performance or storage space
is an issue, which could be an another advantage of using integers with
reference to a lookup table of strings, if e.g. your database contains other
large tables with many records referring to these fruits, vegetables or
other items.

It's somewhat similar to the issue with dates. You can store them as Julian
numbers which is more effective, but then you have the hassle of having to
use date function in the select statement for the result to be meaningful.
Thus, unless there is a storage or performance issue I just store the dates
as strings, e.g. '2011-10-12'. 

/Frank Missel


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Faulty acceptance of non-aggregate value that is notin group by part of the SELECT statement

2011-10-11 Thread Frank Missel
> -Original Message-
> From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-
> boun...@sqlite.org] On Behalf Of Petite Abeille
> Sent: 12 October 2011 00:15
> 
> I wonder how much of that feature is intentional vs. accidental. To me it
> looks more like an implementation detail leak being post-rationalized as a
> feature. Just my 2ยข though.

I thought so  too, but the behaviour is documented under the select
statement:

http://www.sqlite.org/lang_select.html#resultset

Under " 3. Generation of the set of result rows."

Third bullet, third paragraph:

"If the SELECT statement is an aggregate query with a GROUP BY clause, then
each of the expressions specified as part of the GROUP BY clause is
evaluated for each row of the dataset. Each row is then assigned to a
"group" based on the results; rows for which the results of evaluating the
GROUP BY expressions are the same are assigned to the same group. For the
purposes of grouping rows, NULL values are considered equal. The usual rules
for selecting a collation sequence with which to compare text values apply
when evaluating expressions in a GROUP BY clause. The expressions in the
GROUP BY clause do not have to be expressions that appear in the result. The
expressions in a GROUP BY clause may not be aggregate expressions.

If a HAVING clause is specified, it is evaluated once for each group of rows
as a boolean expression. If the result of evaluating the HAVING clause is
false, the group is discarded. If the HAVING clause is an aggregate
expression, it is evaluated across all rows in the group. If a HAVING clause
is a non-aggregate expression, it is evaluated with respect to an
arbitrarily selected row from the group. The HAVING expression may refer to
values, even aggregate functions, that are not in the result.

Each expression in the result-set is then evaluated once for each group of
rows. If the expression is an aggregate expression, it is evaluated across
all rows in the group. Otherwise, it is evaluated against a single
arbitrarily chosen row from within the group. If there is more than one
non-aggregate expression in the result-set, then all such expressions are
evaluated for the same row.

Each group of input dataset rows contributes a single row to the set of
result rows. Subject to filtering associated with the DISTINCT keyword, the
number of rows returned by an aggregate query with a GROUP BY clause is the
same as the number of groups of rows produced by applying the GROUP BY and
HAVING clauses to the filtered input dataset.
"

But then again, it could of course still be something that was documented
after the fact.

In any case,  although I can see a certain advantage as explained earlier by
Igor (less SQL code, very small performance gain) when the a column contains
the same value for all rows in the result set, I think that the danger of
getting a wrong result (as from what you would expect) far outweighs the
advantage. The best would be a choice now that the feature is there and
expected by some.

Does anyone from SQLite Development have an opinion on this?


/Frank Missel

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Faulty acceptance of non-aggregate value that is notin group by part of the SELECT statement

2011-10-11 Thread Frank Missel
> Sometimes, you know that the value of a particular column is in fact
unique
> across the group (in which case it doesn't matter which row it's taken
from).
> This knowledge could come from invariants being maintained that are not
> perhaps formally captured in the database schema, or else flow from the
> particular join and WHERE conditions.
> 
> In such cases (which come up surprisingly often, in my experience), it's
> convenient to be able to just use the column name. I also work with MySQL
a
> bit, which doesn't allow that, so you have to wrap the column name in
min()
> or max() (doesn't matter which, as all values are the same). Personally, I
find
> it annoying. It makes the database engine do unnecessary comparisons, thus
> hurting performance (though I admit that the difference is likely to be
> immeasurably small), and more importantly, it makes the statement more
> verbose and difficult to read and understand.
> 
> Now, if there were some kind of a PRAGMA that would turn this behavior off
> and enforce stricter syntax rules, I wouldn't be against it. I'd likely
just never
> use it. Please feel free to try and convince SQLite developers (of which
I'm
> not) to add such a pragma (but don't expect me to pitch in for the cause).
> --
> Igor Tandetnik

Ok, I see that.

So its weighing the extra typing (and small performance gain) against the
prevention of an error by a mistakenly left out column.
Yeah, a pragma strict would be great.  I can't be the only one who would
rather make the queries more error proof.


/Frank Missel

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Faulty acceptance of non-aggregate value that is not in group by part of the SELECT statement

2011-10-11 Thread Frank Missel
> > Otherwise, you must enlighten me on how it could have a practical use
> > to have an arbitrary value in the group returned together with the
> > total number of records across all the groups.
> 
> It's not useful in your specific query, but it is useful in others. Yours
is not the
> only system in the world that uses SQLite, you know.
> 

Igor, I did not mean to belittle SQLite (or the very good work that you are
doing on this e-mail list).
In fact, I think it is a fantastic tool which can be used for many purposes.


When I write as I do, it is after weighing the advantage of this "feature"
against the disadvantages. 

As for the advantages, I just don't see how it could be practical to have an
arbitrary group value together with the total number of records in an
application. But sometimes a feature / error that was not intended finds its
way into an application and so it is still supported in later versions. I
guess it could be so. Or perhaps it can be useful in applications of a
different type than I have been involved in.

On the downside I often see mistakes of left out elements in group by
clauses, and then it is a blessing to have it pointed out by the compiler
rather than having an erroneous result as in the example.  This is
especially the case when the query may have a complex group by clauses of
many elements and perhaps even derived tables (e.g. select ... from (select
.. from) left join (select ..), etc. In these cases it is so very easy to
leave out a single field -- and I admit that I have sometimes been saved by
the compiler. Now the query might generate data that are used directly in
the application or perhaps even fed to further queries after storage, and so
an error like that may propagate through the system.
But even for simple queries it is a common mistake -- especially for someone
new to SQL.

Thus weighing for and against considering everybody, I find that it would be
better not to have this "feature" but rather have the compiler return an
error. Those who might be dependent upon the "feature" could have an option
to enable it. I simply find that most would benefit from that - experienced
as well people new to SQL / SQLite.

Also, I think that if you need an arbitrary value it would be better to get
that explicitly through usage of the random function. The other way is more
obscure and is bad SQL code (in my opinion ;-).

So I hope that there could be a way to not have this behaviour, but in the
meantime I will check my SELECT statements like a hawk ;-)


Best regards

Frank

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Faulty acceptance of non-aggregate value that is not ingroup by part of the SELECT statement

2011-10-11 Thread Frank Missel
> 
> SQLite allows this as an extension. When this happens, a value from an
> arbitrary row within the group is reported. This is often convenient.
> 
Ha ha, you must be joking, right?
Otherwise, you must enlighten me on how it could have a practical use to
have an arbitrary value in the group returned together with the total number
of records across all the groups.

> > The result is strange and misleading and can easily lead to data errors.
> 
> If you don't like this facility, don't use it in your queries. No one's
forcing you.
> --

Again, assuming that you are joking here :-). 
But just in case you are not:
Well, in my work with SQL across a number of databases, this is one of the
very common errors: to accidentally leave out an element in the group by
clause of the SELECT statement and so many tools give a error rather than
proceed. 
I think it should be with the same for SQLite -- at least as an option -- or
even better: an option to enable the extension with this weird behaviour
should one require it for some obscure reason.


/Frank Missel

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Faulty acceptance of non-aggregate value that is not in group by part of the SELECT statement

2011-10-11 Thread Frank Missel
Consider a table:

 

create table book (book_id integer primary key, type varchar, title
varchar);

 

and some data for it:

 

insert into book (book_id, type, title) 

values (1, 'hardback', 'book 1')

;

insert into book (book_id, type, title) 

values (2, 'softback', 'book 2')

;

 

Now execute the following SQL:

 

select type, count(*)

from book

;

 

Which gives the result: 

type   count(*)

softback   2

 

I would have expected an error message here to the effect that a non
aggregate value was not part of the grouping.

The result is strange and misleading and can easily lead to data errors.

 

The SQL statement should have been:

select type, count(*)

from book

group by type

;

 

Which gives:

type   count(*)

hardback 1

softback   1

 

 

 

Best regards,

 

Frank Missel

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Faulty acceptance of non-aggregate value that is not in group by part of the SELECT statement

2011-10-11 Thread Frank Missel
Consider a table:

 

create table book (book_id integer primary key, type varchar, title
varchar);

 

and some data for it:

 

insert into book (book_id, type, title) 

values (1, 'hardback', 'book 1')

;

insert into book (book_id, type, title) 

values (2, 'softback', 'book 2')

;

 

Now execute the following SQL:

 

select type, count(*)

from book

;

 

Which gives the result: 

type   count(*)

softback   2

 

I would have expected an error message here to the effect that a non
aggregate value was not part of the grouping.

The result is strange and misleading and can easily lead to data errors.

 

The SQL statement should have been:

select type, count(*)

from book

group by type

;

 

Which gives:

type   count(*)

hardback 1

softback   1

 

 

 

Best regards,

 

Frank Missel

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Sqlite3 transactions without Durability

2011-09-25 Thread Frank Missel
> 
> That several users accessing same DB from several processes can't perform
> more then 10 transactions per second...
> 
> > Simon

SQLite is not a good multi-user database and never have claimed to be such.
If you need very good multi-user performance, a proper client-server
database may be a better choice, e.g.:

Open Source:
PostgreSQL
MySQL
Firebird

Free but not Open Source:
MS SQL Server Express (10 Gb database, 1 CPU, 1 Gb Ram limits)
IBM DB2 Express (no limits on database size, 4 Gb Ram limit, some limit on
CPUs/Cores depending on edition)

SQLite is good for single user write access and multiple read only access,
often used as data storage for local applications. However, it is also
gaining more widespread usage as data storage / analysis database for large
amounts of data from e.g. logging or statistical / scientific research data.


/Frank

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Sqlite3 transactions without Durability

2011-09-25 Thread Frank Missel
Hi,

As suggested by others working with the pragma commands will work wonders.
For massive inserts or updates I use:

locking_mode  = exclusive
journal_mode = off
synchronous = 0
cache_size = 40 (or even higher value if you have enough memory)

When setting journal_mode off you may indeed end up with a corrupted
database in case of a crash, so I usually start such a session by taking an
OS copy of the database (which is quite fast even for gigabyte size
databases), store the file path of the copy database in a file or other
database. If the job completes the copy database is automatically removed by
my program.

As for matching the performance of more advanced databases, I find that I
can often obtain similar performance by structuring the tables, indexes and
queries to be performed. Also, since only a single connection can write to
an SQLite database at any time, working with several databases at the same
time can further speed up things -- if the work can be split up in
independent jobs which can run in parallel on several cores or CPUs.

When handling a very large number of records, e.g. log data for 2 years
which might come to as much as 80 million records, one would anyway usually
split up the data in several tables / databases to increase performance --
this is even done when using powerful databases and data warehouses as index
performance otherwise would suffer a lot. Of course,  facilities to do this
easily are built into the more powerful databases -- but if you are willing
to tinker a bit with it reasonable performance can normally be obtained.


Best regards,

Frank


> -Original Message-
> From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-
> boun...@sqlite.org] On Behalf Of Artyom Beilis
> Sent: 25 September 2011 19:13
> To: sqlite-users@sqlite.org
> Subject: [sqlite] Sqlite3 transactions without Durability
> 
> Hello,
> 
> On of the things that is supported by "big" databases is non-durable
> transactions that make the database updates much faster.
> 
> For example PostgreSQL has an option synchonous_commits
> 
> and MySQLhas an option innodb_flush_log_at_trx_commit
> 
> Basically I want to do many inserts and reads using the database and I
want
> to keep ACI of ACID part. I mean I don't care that some of the
transactions
> are get lost, but I do care that the database state would remain
consistent in
> case of catastrophic fault.
> 
> The simplest solution is to start a transaction and commit it once in a
while it
> would make it **very fast** because fsync would be called only once in a
> while.
> 
> The problem is that it works for one connection only and I can't do the
same
> job from multiple connections and/or from multiple processes.
> Is there any way to solve this problem, any custom VFS or module?
> 
> 
> I understand that such option requires probably an additional thread or
> process to do this.
> 
> Thanks a lot.
> 
> Artyom Beilis
> --
> CppCMS - C++ Web Framework:   http://cppcms.sf.net/
> CppDB - C++ SQL Connectivity: http://cppcms.sf.net/sql/cppdb/
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] DISTINCT on a JOIN

2011-09-23 Thread Frank Missel
> course1   number-of-people-involved-in-course1
number-of-tasks-involved-in-course1
> course2   number-of-people-involved-in-course2
number-of-tasks-involved-in-course2
> course3   number-of-people-involved-in-course3
number-of-tasks-involved-in-course3
> course4   number-of-people-involved-in-course4
number-of-tasks-involved-in-course4

This can be produced by:

select crs.id
 , crs.name
 , (select count(id) from people pe where pe.course = crs.id) as
num_people
 , (select count(id) from tasks ta where ta.course = crs.id)as
num_tasks
from courses crs
;

Result:
14  starter  course  2   2



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users