Re: [sqlite] SQLite version 3.7.6 testing

2011-02-13 Thread Sven L

Lol... we just had a long discussion about sqlite_stat2 and the ANALYZE command.
I removed these from my code, since the command should not be used except in
rare cases :P
 
Should I re-add my ANALYZE code and ask my original question again? Confused :P
 
 From: d...@sqlite.org
 Date: Sat, 12 Feb 2011 10:27:40 -0500
 To: sqlite-...@sqlite.org; sqlite-users@sqlite.org
 Subject: [sqlite] SQLite version 3.7.6 testing
 
 The scheduled release of SQLite version 3.7.6 is mid-April. We are still
 two months away. However, version 3.7.6 will contain a lot of pent-up
 changes and so your advance testing of this release will be appreciated.
 
 The latest pre-3.7.6 code can be found here:
 
 http://www.sqlite.org/draft/download.html
 
 And sketchy information about the changes added (so far) can be seen here:
 
 http://www.sqlite.org/draft/releaselog/3_7_6.html
 
 Of particular concern in 3.7.6 are some tweaks we've done to the query
 planner. The query planner is the part of SQLite that decides on the best
 and fastest algorithm to use to satisfy your queries. SQLite uses a
 cost-based query planner. In other words, it tries to estimate the CPU and
 disk I/O load used by various alternative algorithms and it selects the
 algorithm with the lowest estimated cost. The problem with this approach is
 that the cost really is an estimate based on incomplete information, and so
 it sometimes comes out wrong. The cost functions for 3.7.6 have been
 enhanced to be more accurate (we believe) particularly if SQLite is compiled
 using SQLITE_ENABLE_STAT2 and you run ANALYZE on your database files to
 gather statistics. But there could be instances where the cost estimates
 are not as good as before, resulting in performance regressions.
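 
 A minimal sketch of that testing workflow, assuming a build compiled with
 SQLITE_ENABLE_STAT2 (the table name and query below are placeholders):
 
    ANALYZE;
    EXPLAIN QUERY PLAN
      SELECT * FROM mytable WHERE mycol = 42;
 
 ANALYZE fills sqlite_stat1 (and sqlite_stat2 when enabled) with the
 statistics the new cost functions use, and EXPLAIN QUERY PLAN shows which
 plan was picked, so you can diff plans between 3.7.5 and a snapshot.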
 
 So, if you have the ability to download an early snapshot of SQLite 3.7.6
 and compile it into your application for performance testing purposes, we
 would really appreciate it if you would do so. And do so sooner rather than
 later so that we can have plenty of time to fix any issues you discover prior
 to the 3.7.6 release. If the query planner in 3.7.6 improves the
 performance of your application, we'd like to hear about that too.
 
 I'll try to post snapshots of 3.7.6 on a regular basis so that you can
 download and compile a fairly recent version at any time. Please pester me
 if I forget.
 
 Thanks for your help.
 -- 
 D. Richard Hipp
 d...@sqlite.org


Re: [sqlite] UPDATE/INSERTing 1-2k rows slower than expected

2011-02-13 Thread Jean-Christophe Deschamps

So if the drive is 5400 rpm, 227 is much more than 5400/60=90 and even if
it's 7200 (manufacturers sometimes upgrade drives inside portable hd 
without
prior notice), it's still twice as much as 7200/60=120.

5400/60, 7200/60 ... those values rely on the assumption that 
successive LBAs are mapped to successive physical sectors (512 or 4K, 
whatever) on the same face of the same platter.  Is it obvious that all 
today's typical stock drives actually implement only that simple old 
scheme and not an untold mix of various interleaving techniques?

Yet that wouldn't explain a two- or threefold difference between 
internal SATA and USB.  That sounds surprising.



Re: [sqlite] UPDATE/INSERTing 1-2k rows slower than expected

2011-02-13 Thread Jim Wilcoxson
On Sun, Feb 13, 2011 at 1:15 AM, Max Vlasov max.vla...@gmail.com wrote:

 On Sat, Feb 12, 2011 at 7:11 PM, Jim Wilcoxson pri...@gmail.com wrote:

 
 
  Unless I'm missing something, SQLite has to update the first page of the
  database on every commit, to update the change counter.  Assuming you are
  using rotating media, that record can only be updated 120 times per
 second,
  maximum, on a 7200RPM drive.
 
 

 Hmm, I think there's something wrong with this theory of a single flush per
 round.

 Yesterday I posted results from tests that ended with something like 50
 commits per second in the case of a single flush (journal_mode=off). I
 decided to run a simple program that writes a byte to a file and calls
 Windows FlushFileBuffers. It reported 53 writes per second, the expected
 result if the theory is right. But when I connected an external drive
 (WD1200E1MS, bus-powered, it seems it is even 5400 rpm), this number jumped
 to 253. I thought: OK, maybe the OS or maybe the drive is trying to fool
 me; let's test this another way. I did the following:

 - Checked that the removable drive's 'Quick removal' option is on, so no
 write cache on the OS side
 - Opened the same test db on this drive, journal_mode=off for the best
 scenario
 - Executed 10,000 updates to a single record, each as a separate
 transaction ( UPDATE TestTable SET Id=Id+1 )
 - When the program reported ok, I quickly unplugged the usb cord. The
 device is bus-powered so it had no power to complete any postponed
 operation. The total time was about 50 seconds, and to do a really quick
 unplug I prepared my hands; the delay was no more than 200 ms
 (subjectively).
 - Replugged the drive and opened the db in a tool to see whether there was
 any corruption or other loss of data.

 So the final results:
 - the db was always ok and contained the correct value (id=10001 for an
 initial 1).
 - the speed was about 227 commits per second, very close to my system
 flush test (253)

 So if the drive is 5400 rpm, 227 is much more than 5400/60=90 and even if
 it's 7200 (manufacturers sometimes upgrade drives inside portable hd
 without
 prior notice), it's still twice as much as 7200/60=120.


If we go with the theory that a single record/sector can only be written
once per revolution, that means writes can occur 90 times per second, or
about every 11ms on your 5400 rpm drive.  So for this test to show
corruption, you'd have to unplug the drive within 11ms of the last commit,
which I think is practically impossible.

My hypothesis to explain the behavior you are seeing is that it takes 4ms
(1/253) for your OS to flush its buffers to the hard drive cache, issue the
sync to the drive, and the hard drive to acknowledge the sync.  When it is
convenient, the drive will write this data to the disk.  The sync command
may make this more urgent than usual, which would explain why I/O slows down
if sync is used.  Or, the drive could simply have a sync timer: the first
sync command starts the timer, and when 4ms have passed, the drive dumps its
cache to the platters.

Jim


Re: [sqlite] UPDATE/INSERTing 1-2k rows slower than expected

2011-02-13 Thread Jim Wilcoxson
On Sun, Feb 13, 2011 at 7:07 AM, Jean-Christophe Deschamps j...@antichoc.net
 wrote:


 So if the drive is 5400 rpm, 227 is much more than 5400/60=90 and even if
 it's 7200 (manufacturers sometimes upgrade drives inside portable hd
 without
 prior notice), it's still twice as much as 7200/60=120.

 5400/60, 7200/60 ... those values rely on the assumption that
 successive LBAs are mapped to successive physical sectors (512 or 4K,
 whatever) on the same face of the same platter.  Is it obvious that all
 today's typical stock drives actually implement only that simple old
 scheme and not an untold mix of various interleaving techniques?



This is true for the case of writing multiple records or multiple sectors.
For example, if you have a drive with 5000 sectors per track and you write
sector 1 of the track with a hard sync, you may have time to write sector
2500 with a hard sync in the same revolution.  Or maybe you can write every
500 sectors with a hard sync in the same revolution, giving you a commit
rate of 10 per revolution or 900 commits per second on a 5400 rpm drive.

But what I postulate is that you can't physically write *the same* record
over and over more than 90 times per second on a 5400 rpm drive, unless the
drive, OS, or filesystem implements something like wear-leveling, where the
physical location of sectors is constantly changing.

Jim
--
HashBackup: easy onsite and offsite Unix backup
http://www.hashbackup.com


[sqlite] Query optimizer with attached databases

2011-02-13 Thread Felix Schlesinger
Hello,

I have come across several situations in which the sqlite query optimizer
chooses clearly suboptimal query plans when a query spans several attached
databases. If I copy all involved tables into one sqlite database file, a
better query plan is chosen. The difference can be quite dramatic (orders of
magnitude faster). 

Unfortunately it does not seem to happen in very simple databases with small
datasets, so I do not have a simple self-contained example I could upload, but
one general situation that goes wrong is:

database a:
  table a (id primary key, otherCOLS)
  table va (aid references a(id), foo)

database b:
  table b (id primary key, otherCOLS)
  table vb (bid references b(id), foo)

database c:
  table ab (aid, bid)
    -- references tables a.a and b.b
    -- indexed on aid and bid; contains a smallish subset of all rows from a and b

query: 
select count(*) 
  FROM ab 
  JOIN a ON(a.id = ab.aid) 
  JOIN b ON(b.id = ab.bid) 
  JOIN va ON(va.aid = a.id) 
  JOIN vb ON(vb.bid = b.id) 
WHERE va.foo = 'rare value' AND vb.foo = 'rare value';

The optimizer chooses to create an automatic index on vb and then runs the
query in the order va, vb, a, b, ab. However, reordering to e.g. ab, a, va, b,
vb is much faster (if I either enforce it by hand or import all tables into
one db), as sketched below.
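
For reference, SQLite treats the CROSS JOIN keyword as a constraint that the
tables be joined in the order written, so one hedged way to force the faster
ordering by hand (same tables and predicates as above, with the join
conditions moved into the WHERE clause) would be:

select count(*)
  FROM ab
  CROSS JOIN a
  CROSS JOIN va
  CROSS JOIN b
  CROSS JOIN vb
 WHERE a.id = ab.aid AND va.aid = a.id
   AND b.id = ab.bid AND vb.bid = b.id
   AND va.foo = 'rare value' AND vb.foo = 'rare value';

Prefixing either form with EXPLAIN QUERY PLAN shows the order actually chosen.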

Are there known issues with query optimization across different attached
databases, or maybe performance tips? I ran ANALYZE on all DBs. I use sqlite
version 3.7.5 on Linux (x86_64).

Thanks for this great software and any tips you might have about this issue.




Re: [sqlite] UPDATE/INSERTing 1-2k rows slower than expected

2011-02-13 Thread Max Vlasov
On Sun, Feb 13, 2011 at 5:31 PM, Jim Wilcoxson pri...@gmail.com wrote:

 On Sun, Feb 13, 2011 at 1:15 AM, Max Vlasov max.vla...@gmail.com wrote:
  So the final results:
  - the db was always ok and contains the correct value (id=10001 for
 initial
  1).
  - the speed was about 227 commits per second so very close to my system
  flush test  (253)
 
  So if the drive is 5400 rpm, 227 is much more than 5400/60=90 and even if
  it's 7200 (manufacturers sometimes upgrade drives inside portable hd
  without
  prior notice), it's still twice as much as 7200/60=120.
 

 My hypothesis to explain the behavior you are seeing is that it takes 4ms
 (1/253) for your OS to flush its buffers to the hard drive cache, issue the
 sync to the drive, and the hard drive to acknowledge the sync.  When it is
 convenient, the drive will write this data to the disk.  The sync command
 may make this more urgent than usual, which would explain why I/O slows
 down
 if sync is used.  Or, the drive could simply have a sync timer: the first
 sync command starts the timer, and when 4ms have passed, the drive dumps
 its
 cache to the platters.



Jim, I see your point. Maybe for removable media on Windows sync really is
still lazy, but much less lazy than general cached operations. Another
possibility is that removable hds report that they wrote some data while
actually working as you described, postponing it a little.

But I thought about how it would be possible to test this explanation. I'm
going to do some tests that work like this: the same id updating, but in
the middle of the 10,000 operations I will unplug the cord; sqlite will say
that it can't commit a particular update, and I can see what the value of
that update is. After replugging, two variants are possible:
- I will find the previous value in the base. If for several dozen tests
the result is the same, that would mean the situation still needs to be
explained.
- I will find, at least in some of the tests, some value less than the one
previous to the update sqlite could not commit (maybe -3, -4 from the
failed one). In this case the explanation will be confirmed.

How about this? Does it sound reasonable?

Max Vlasov


Re: [sqlite] UPDATE/INSERTing 1-2k rows slower than expected

2011-02-13 Thread Jean-Christophe Deschamps

But what I postulate is that you can't physically write *the same* record
over and over more than 90 times per second on a 5400 rpm drive, unless
the drive, OS, or filesystem implements something like wear-leveling,
where the physical location of sectors is constantly changing.

It's still possible if the firmware lies about physical sync and, 
having noticed that you repeatedly overwrite the same LBA, just 
postpones writes in its queue and only actually flushes the last one.

By precisely monitoring input power levels, the drive firmware knows 
how much it can write before its internal power reserves drop below 
unreliable levels and it also knows what's left in its write queue, 
accounting for every internal parameter.  So it's still possible that 
some drives give you a much higher rate than our simple-minded theory 
predicts.

Given the low cost of computing power now embedded in most devices, 
such behavior is quite possible.



[sqlite] basic rtree question

2011-02-13 Thread Puneet Kishor
I have a simple table like so

CREATE TABLE points (
id INTEGER PRIMARY KEY,
lon REAL,
lat REAL,
tile INTEGER
);

It has about 13.25 million rows. I want to be able to return rows given a 
bounding box (min_lon, min_lat, max_lon, max_lat). Is the following the right 
strategy?

CREATE VIRTUAL TABLE points_rtree USING rtree (
id, min_lon, min_lat, max_lon, max_lat
);

and then querying like so

SELECT p.id, tile 
FROM points p JOIN points_rtree pr ON p.id = pr.id 
WHERE min_lon >= -91 AND max_lon <= -89 AND min_lat >= 43 AND max_lat <= 45;



-- 
Puneet Kishor 





Re: [sqlite] UPDATE/INSERTing 1-2k rows slower than expected

2011-02-13 Thread Jim Wilcoxson
On Sun, Feb 13, 2011 at 11:55 AM, Max Vlasov max.vla...@gmail.com wrote:

 On Sun, Feb 13, 2011 at 5:31 PM, Jim Wilcoxson pri...@gmail.com wrote:

  On Sun, Feb 13, 2011 at 1:15 AM, Max Vlasov max.vla...@gmail.com
 wrote:
   So the final results:
   - the db was always ok and contains the correct value (id=10001 for
  initial
   1).
   - the speed was about 227 commits per second so very close to my system
   flush test  (253)
  
   So if the drive is 5400 rpm, 227 is much more than 5400/60=90 and even
 if
   it's 7200 (manufacturers sometimes upgrade drives inside portable hd
   without
   prior notice), it's still twice as much as 7200/60=120.
  
 
  My hypothesis to explain the behavior you are seeing is that it takes 4ms
  (1/253) for your OS to flush its buffers to the hard drive cache, issue
 the
  sync to the drive, and the hard drive to acknowledge the sync.  When it
 is
  convenient, the drive will write this data to the disk.  The sync command
  may make this more urgent than usual, which would explain why I/O slows
  down
  if sync is used.  Or, the drive could simply have a sync timer: the first
  sync command starts the timer, and when 4ms have passed, the drive dumps
  its
  cache to the platters.
 


 Jim, I see your point,  maybe really for removable media on Windows sync is
 still lazy, but much less lazy then general cached operations. Another
 version is that removable hds can report that they wrote some data while
 actually work as your described postonning it a little.

 But I thought about how it would be possible to test this explanation . I'm
 going to do some tests that works like this. The same id updating, but in
 the middle of 10,000 operation I will unplug the cord, the sqlite will say
 that it can't commit particular insert and I can see what is the value of
 this insert. After replugging two variants possible:
 - I will find the previous value in the base. If for several dozens tests
 the result is the same, that would mean that the situation is still needed
 to be explained.
 - I will find some value less than previous to the one sqlite could not
 commit at least in some of the tests (maybe -3, -4 to the failed one). In
 this case the explanation will be confirmed.

 How about this? Does it sound reasonable?


Sounds like a clever test to me!

I also found this page, used to force a Linux system crash:

http://evuraan.blogspot.com/2005/06/how-to-force-paniccrash-in-linux.html

I seem to remember a post that SQLite commit/sync is tested with the kill
command, but it seems like actually crashing the machine (which is done
without a sync first) might be a better test.

Jim
--
HashBackup: easy onsite and offsite Unix backup
http://www.hashbackup.com


Re: [sqlite] UPDATE/INSERTing 1-2k rows slower than expected

2011-02-13 Thread Simon Slavin

On 13 Feb 2011, at 5:53pm, Jim Wilcoxson wrote:

 I also found this page, used to force a Linux system crash:
 
 http://evuraan.blogspot.com/2005/06/how-to-force-paniccrash-in-linux.html
 
 I seem to remember a post that SQLite commit/sync is tested with the kill
 command, but it seems like actually crashing the machine (which is done
 without a sync first) might be a better test.

You can do extensive experimentation and find out exactly how one setup acts:  
one particular hard disk, motherboard, operating system, and (disk) device 
driver.  But change any variable, even upgrade to a later version of the OS or 
device driver, and you can get different behaviour.  Same thing if you move the 
jumpers on the hard disk or upgrade its firmware.  So if you test your 
programming on your own computer but send your application to customers, you 
don't know how it'll perform.

At the level you're worrying about, you're learning too much about a very 
narrow subject.  If you need a general solution you have to write for the 
specs, not for one particular setup.

Simon.


Re: [sqlite] basic rtree question

2011-02-13 Thread Jay A. Kreibich
On Sun, Feb 13, 2011 at 11:47:48AM -0600, Puneet Kishor scratched on the wall:
 I have a simple table like so
 
 CREATE TABLE points (
 id INTEGER PRIMARY KEY,
 lon REAL,
 lat REAL,
 tile INTEGER
 );
 
 It has about 13.25 million rows. I want to be able to return rows
 given a bounding box (min_lon, min_lat, max_lon, max_lat). Is the
 following the right strategy?
 
 CREATE VIRTUAL TABLE points_rtree USING rtree (
 id, min_lon, min_lat, max_lon, max_lat
 );

  Almost.  You need to match min/max pairs, so the column order would
  need to be:

( id, min_lon, max_lon, min_lat, max_lat )

  ...or... 

( id, min_lat, max_lat, min_lon, max_lon )

 and then querying like so
 
 SELECT p.id, tile 
 FROM points p JOIN points_rtree pr ON p.id = pr.id 
 WHERE min_lon >= -91 AND max_lon <= -89 AND min_lat >= 43 AND max_lat <= 45;

  Yes, that's the general idea.  Since the R-Tree table can only hold
  an ID column (in addition to the min/max pairs), it is normally
  JOINed against a standard table with more specific information.
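
  A sketch of the corrected table, plus a hypothetical one-shot way to
  populate it from the existing points table (a point is encoded by
  setting min and max to the same value):

    CREATE VIRTUAL TABLE points_rtree USING rtree (
        id, min_lon, max_lon, min_lat, max_lat
    );
    INSERT INTO points_rtree
        SELECT id, lon, lon, lat, lat FROM points;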

  When using lat/lons, be sure to account for bounding boxes that cross
  the +180/-180 meridian.  This can be done by taking the union of a
  box on each side of the anti-meridian.
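
  For instance, a hedged sketch of the anti-meridian case as the union of
  one box on each side (the coordinates here are made up):

    SELECT id FROM points_rtree
     WHERE min_lon >= 170 AND max_lon <= 180
       AND min_lat >= 43 AND max_lat <= 45
    UNION
    SELECT id FROM points_rtree
     WHERE min_lon >= -180 AND max_lon <= -170
       AND min_lat >= 43 AND max_lat <= 45;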

   -j

-- 
Jay A. Kreibich  J A Y  @  K R E I B I.C H 

Intelligence is like underwear: it is important that you have it,
 but showing it to the wrong people has the tendency to make them
 feel uncomfortable. -- Angela Johnson


Re: [sqlite] basic rtree question

2011-02-13 Thread Puneet Kishor


On Sunday, February 13, 2011 at 4:15 PM, Jay A. Kreibich wrote: 
 I have a simple table like so
  
  CREATE TABLE points (
  id INTEGER PRIMARY KEY,
  lon REAL,
  lat REAL,
  tile INTEGER
  );
  
  It has about 13.25 million rows. I want to be able to return rows
  given a bounding box (min_lon, min_lat, max_lon, max_lat). Is the
  following the right strategy?
  
  CREATE VIRTUAL TABLE points_rtree USING rtree (
  id, min_lon, min_lat, max_lon, max_lat
  );
 
 Almost. You need to match min/max pairs, so the column order would
 need to be:
 
 ( id, min_lon, max_lon, min_lat, max_lat )
 
 ...or... 
 
 ( id, min_lat, max_lat, min_lon, max_lon )

Fascinating. You mean the order of the columns in the table definition
matters? I did not realize that would be important.

Puneet. 


[sqlite] Question about having 2 db connections on same thread

2011-02-13 Thread jeff archer
I thought this would be OK but now that I am seeing it fail I am having my 
doubts.

I have created 2 db connections on the same thread, both on the same file, 
i.e. 2 calls to sqlite3_open16() (using the exact same filename both times) 
have created 2 sqlite3 instances.  I do this because I have 2 classes which 
each have their own set of tables in this single db file.  I thought it would 
be OK because I do not intermix operations on the 2 connections.

I do a transaction on connection 1 using BEGIN IMMEDIATE, some rows updated, 
COMMIT.
Then, when I attempt the same sequence on connection 2 and make the first 
call to step to execute the begin transaction, it never returns.

 Jeff Archer
Nanotronics Imaging
jsarc...@nanotronicsimaging.com
330819.4615 


Re: [sqlite] Speed up count(distinct col)

2011-02-13 Thread BareFeetWare
On 13/02/2011, at 1:04 AM, Yuzem wrote:

 I am grabbing the data from the each movie imdb webpage.

Does IMDB allow use of their data this way? After my brief reading of their 
site, I thought they charge a $15k minimum per year for data.

 The script is written in bash and I can give you the code but I think it 
 would be very complicated to adapt it to your schema.

Complicated? Why? In what format is the grabbed data (ie what tables, columns, 
rows)? It usually just takes an INSERT ... SELECT to move data from one table's 
schema to another.

 You count directors like this:
 
 select sum(Count) from "Capacity People Statistics"
 where Capacity_ID = (select ID from Capacity where Name = 'director')
 ;
 
 Are you sure that this count how many directors there are?

Well, yes, but it depends on the definition of how many directors there are. 
The above counts how many directors there are in total, counting the same 
person once for each movie they direct.

What you want, I think, however, is how many people there are who are directors 
(counting each person just once, even if they direct multiple movies), given by 
this:

select count(*) from "Capacity People Statistics"
where Capacity_ID = (select ID from Capacity where Name = 'director')
;

 If I understand it correctly (probably I don't) you have for example table
 "Movie People":
 1|director|1
 2|director|1
 3|director|2
 
 There are 2 directors, and then in "Capacity People Statistics" you should
 have:
 director|1|2
 director|2|1
 
 If you use the previous code:
 select sum(Count) from "Capacity People Statistics"
 where Capacity_ID = (select ID from Capacity where Name = 'director')
 ;
 
 I think it will return 3 but there are only 2 directors.

Correct. That query will tell you that there are three (3) directors in the 
database. But two (2) of those directors are the same person.

To instead get what you want, the number of people who are directors, do this 
(repeat of above SQL):

select count(*) from "Capacity People Statistics"
where Capacity_ID = (select ID from Capacity where Name = 'director')
;

which gives your desired answer of 2.
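
Equivalently, skipping the statistics table and counting straight off the raw 
per-movie table, something like the following should also give 2 (assuming the 
sample rows above are movie|capacity|person, with hypothetical column names):

select count(distinct People_ID) from "Movie People"
where Capacity = 'director';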

 Sorry, but I still don't understand it because I don't understand how you 
 can count directors that way.

Hopefully now that I've given you the query you actually needed, it now makes 
sense ;-)

 I expect the above to be about the same speed or faster... than separate 
 tables, but far more flexible (eg no need to add a table to accommodate a 
 new capacity), and better normalized.
 

 But using different tables provides an instant result, you can try it with
 any table:
 SELECT count(*) from table;

Let me know if you see any noticeable difference in speed.

 On the other hand you are right, it is less flexible.

 Oh wait... I think I got it. I need two more tables:

No, I don't think you need more tables to achieve the above.

Thanks,
Tom
BareFeetWare

 --
Comparison of SQLite GUI tools:
http://www.barefeetware.com/sqlite/compare/?ml



Re: [sqlite] Speed up count(distinct col)

2011-02-13 Thread Random Coder
On Feb 13, 2011, at 3:38 PM, BareFeetWare list@barefeetware.com wrote:

 On 13/02/2011, at 1:04 AM, Yuzem wrote:
 
 I am grabbing the data from the each movie imdb webpage.
 
 Does IMDB allow use of their data this way? After my brief reading of their 
 site, I thought they charge a $15k minimum per year for data.

They do allow some limited free use of their data, though I doubt they'll be 
happy with users scraping their site; I'd imagine they'd prefer devs to 
download the database files they provide:

http://www.imdb.com/interfaces


Re: [sqlite] Question about having 2 db connections on same thread

2011-02-13 Thread Vannus
Are you not able to use the same db connection in both classes?

something like:
FirstTablesClass->dbConnection = MyDBConnectionClass
SecondTablesClass->dbConnection = MyDBConnectionClass

On 13 February 2011 23:04, jeff archer jarch...@yahoo.com wrote:

 I thought this would be OK but now that I am seeing it fail I am having my
 doubts.


 I have created 2 db connections on the same thread, both on the same file.
 i.e.
 2 calls to sqlite3_open16() (use exact same filename both times) have
 created 2
 sqlite3 instances.  I do this because I have 2 classes which each have
 their own
 set of tables in this single db file.  I thought it would be OK because I
 do not
 intermix operations on the 2 connections.

 I do a transaction on connection 1 using BEGIN IMMEDIATE, some rows
 updated,
 COMMIT.
 Then, when I attempt same sequence on connection 2, when I do first call to
 step
 to execute begin transaction and it never returns.

  Jeff Archer
 Nanotronics Imaging
 jsarc...@nanotronicsimaging.com
 330819.4615



Re: [sqlite] Question about having 2 db connections on same thread

2011-02-13 Thread Simon Slavin

On 14 Feb 2011, at 12:30am, Vannus wrote:

 Are you not able to use the same db connection in both classes?
 
 something like
 FirstTablesClass->dbConnection = MyDBConnectionClass
 SecondTablesClass->dbConnection = MyDBConnectionClass

Nevertheless, having two connections should work too.  His description doesn't 
include anything that looks obviously wrong to me.  I don't see why he's having 
the problem.

Simon.


Re: [sqlite] Question about having 2 db connections on same thread

2011-02-13 Thread Pavel Ivanov
 I do a transaction on connection 1 using BEGIN IMMEDIATE, some rows updated,
 COMMIT.
 Then, when I attempt same sequence on connection 2, when I do first call to 
 step
 to execute begin transaction and it never returns.

Do you check the result code from COMMIT, and is it successful? My guess is
you have some SELECT statements which are not reset or finalized
before COMMIT; in that situation your COMMIT would return
SQLITE_BUSY. If that's true, then the second BEGIN IMMEDIATE just can't
finish because another writing transaction is not finished yet.


Pavel

On Sun, Feb 13, 2011 at 6:04 PM, jeff archer jarch...@yahoo.com wrote:
 I thought this would be OK but now that I am seeing it fail I am having my
 doubts.


 I have created 2 db connections on the same thread, both on the same file.  
 i.e.
 2 calls to sqlite3_open16() (use exact same filename both times) have created 
 2
 sqlite3 instances.  I do this because I have 2 classes which each have their 
 own
 set of tables in this single db file.  I thought it would be OK because I do 
 not
 intermix operations on the 2 connections.

 I do a transaction on connection 1 using BEGIN IMMEDIATE, some rows updated,
 COMMIT.
 Then, when I attempt same sequence on connection 2, when I do first call to 
 step
 to execute begin transaction and it never returns.

  Jeff Archer
 Nanotronics Imaging
 jsarc...@nanotronicsimaging.com
 330819.4615



Re: [sqlite] BUG: capi3e.test when compiled with SQLITE_OMIT_UTF16

2011-02-13 Thread Shane Harrelson
Thanks. Hopefully corrected here: http://www.sqlite.org/src/info/b04304b967

-Shane


On Fri, Feb 11, 2011 at 4:59 PM, Noah Hart n...@lipmantpa.com wrote:

 capi3e.test needs
      ifcapable utf16 logic before capi3e-2.1.$i
 to properly pass tests when compiled with SQLITE_OMIT_UTF16

 ~ Noah Hart


 --
 View this message in context: 
 http://old.nabble.com/BUG%3A-capi3e.test-when-compiled-with-SQLITE_OMIT_UTF16-tp30905474p30905474.html
 Sent from the SQLite mailing list archive at Nabble.com.




Re: [sqlite] basic rtree question

2011-02-13 Thread Jay A. Kreibich
On Sun, Feb 13, 2011 at 04:18:25PM -0600, Puneet Kishor scratched on the wall:
 On Sunday, February 13, 2011 at 4:15 PM, Jay A. Kreibich wrote: 

   CREATE VIRTUAL TABLE points_rtree USING rtree (
   id, min_lon, min_lat, max_lon, max_lat
   );
  
  Almost. You need to match min/max pairs, so the column order would
  need to be:
  
  ( id, min_lon, max_lon, min_lat, max_lat )

 Fascinating. You mean, the order of the columns in the table definition
 is important? I did not realize that would be important. 

  Yes, it is very important.  The power of R-Trees comes from their
  ability to work on ranges of data.  They index objects that occupy a
  range (e.g. they have a start/stop or min/max value) within a
  dimension.  Although the most common use is to index 2D or 3D space,
  you might also use a 1D R-Tree to index events (with a start time and
  stop time) within a timeline.  It might not seem like much, but this
  is something that is extremely difficult to do efficiently with
  standard database indexes.

  SQLite R-Trees support one to five dimensions.  Since the table has an
  ID column in addition to a min/max value pair for each dimension, that
  means tables must have an odd number of columns, from 3 to 11.
  The module assumes the first column is the ID column, and each pair
  of additional columns is a min/max value pair in a specific
  dimension.  The min/max values must be given as pairs, and the min
  value must come first, although it is permissible to set both the min
  and max column to the same value.  This can be used to encode points.
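
  For instance, a minimal sketch of the 1-D timeline case (the table and
  values here are hypothetical):

    CREATE VIRTUAL TABLE events_rtree USING rtree (
        id, start_time, end_time    -- one dimension: a single min/max pair
    );
    -- a point event: min and max set to the same value
    INSERT INTO events_rtree VALUES (1, 100.0, 100.0);
    -- events overlapping the window [50, 150]:
    SELECT id FROM events_rtree WHERE end_time >= 50 AND start_time <= 150;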

   -j

-- 
Jay A. Kreibich  J A Y  @  K R E I B I.C H 

Intelligence is like underwear: it is important that you have it,
 but showing it to the wrong people has the tendency to make them
 feel uncomfortable. -- Angela Johnson


Re: [sqlite] UPDATE/INSERTing 1-2k rows slower than expected

2011-02-13 Thread Max Vlasov
On Sun, Feb 13, 2011 at 8:53 PM, Jim Wilcoxson pri...@gmail.com wrote:

  But I thought about how it would be possible to test this explanation .
 I'm
  going to do some tests that works like this. The same id updating, but in
  the middle of 10,000 operation I will unplug the cord, the sqlite will
 say
  that it can't commit particular insert and I can see what is the value of
  this insert. After replugging two variants possible:
  - I will find the previous value in the base. If for several dozens tests
  the result is the same, that would mean that the situation is still
 needed
  to be explained.
  - I will find some value less than previous to the one sqlite could not
  commit at least in some of the tests (maybe -3, -4 to the failed one). In
  this case the explanation will be confirmed.
 
  How about this? Does it sound reasonable?
 
 
 Sounds like a clever test to me!



Thanks for the support :) Now the results.

I switched off the journal:
  PRAGMA journal_mode=OFF;
As I described, the only record contained id=1 and the sql query was
  UPDATE TestTable SET Id=Id + 1
The cord was unplugged in the middle of the 10,000 operations, when about
5000 of the updates had been applied. The hd was a bus-powered external hard
drive and I repeated the test several times. No test showed the expected
value, which confirms Jim's explanation about postponed writing. The actual
values are below (the first is the value expected after reopening, the
second is the actual value found, and the difference is self-explanatory):

5094 - 5086 = -8
5084 - 5083 = -1
5070 - 5049 = -21
5082 - 5069 = -13
5095 - 5086 = -9
5072 - 5033 = -39
5118 - 5053 = -65
5081 - 5075 = -6

So the maximum number of non-flushed commits was 65.

I also made a couple of tests with the journal on to see the difference
between the expected value, the non-restored value, and the restored value.
One of the results:
5078 (expected) - 5077 (non-restored) - 5021 (restored)
For correctly operating os/hardware the numbers should be 5078-5077-5077,
or no journal present at all, depending on the moment of unplugging. So this
postponing also makes it possible for some prior version of the journal file
to exist.

So, the next question was 'where?'. Is software or hardware to blame?
Richard has always mentioned hardware in this context, so I decided to check
with another device. This time it was a harddisk box with separate bus and
power connections (the hd was a Maxtor IDE 80Gb drive).

The first variation was similar to the one with the bus-powered drive; this
time I used the hard button on the box, which is equivalent to unplugging
both connection and power. The difference for a single test was really
impressive: 5355 - 4445 = -910. And when I calculated the numbers for a
non-interrupted test, the drive showed about 300 commits per second.

The second variation was unplugging the cord but keeping the power intact,
so if it's the drive that caches, it could complete its operations. This
time the results were perfect, for example 4822 - 4822, and even 5371 - 5372
= +1, which would actually mean the process was interrupted after all the
data was written but before sqlite finished winSync: the os reported
failure, but the data was already there.

So the sad news about faulty hardware is probably true once again.

Max Vlasov