Re: [sqlite] What wrapper for SQLite?

2006-06-20 Thread Micha Bieber
 A lot of the time, using the vanilla API keeps you from coding in the
 style best suited to a modern language, which may introduce bugs the
 developer isn't used to worrying about.  I don't think it's right to
 recommend it when there are wrappers tailored to keep the developer
 thinking in their own language.

 You certainly don't gain anything by using the flat API over a
 wrapper, unless the wrapper is bloated/buggy!

The problem with wrappers is that most of them are unfinished. The authors
play with them to get some things working and forget about them after their
problem has been solved. Minimal and complete libraries are non-trivial.
Especially annoying is the situation where the developer has decided to
use a wrapper and is confronted with missing functionality at some point
in the future. The thorough evaluation of a wrapper's limitations that this
requires might be time better spent writing your own version ...
Additionally, maintenance problems and dependencies (commercial or not)
might occur - mostly things probably avoided by users of software in
the style of sqlite (KISS, as someone already cited).

I haven't touched Delphi for almost ten years - but why not write a minimal
wrapper and extend it step by step? I do this with my C++ wrappers
routinely. Most of the time it was the better solution in most respects
(time, simplicity).
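A sketch of the idea - a wrapper that exposes only what is needed today and
grows on demand. This is an illustration in Python's stdlib sqlite3 module
rather than Delphi or C++; the class and method names (MiniDB, sql, query)
are made up for this example.

```python
import sqlite3

class MiniDB:
    """Wraps only what is needed today; grow it when a need appears."""

    def __init__(self, path=":memory:"):
        self.con = sqlite3.connect(path)

    def sql(self, statement, params=()):
        # step 1: plain execution with error handling
        try:
            self.con.execute(statement, params)
            self.con.commit()
            return True
        except sqlite3.Error:
            return False

    def query(self, statement, params=()):
        # step 2, added later once reading results became necessary
        return self.con.execute(statement, params).fetchall()

db = MiniDB()
db.sql("CREATE TABLE t (a INTEGER)")
db.sql("INSERT INTO t VALUES (?)", (42,))
rows = db.query("SELECT a FROM t")
```

Each new requirement gets one new small method instead of a full-blown API
implemented up front.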

Micha  
-- 



Re: [sqlite] SQLite performance for 10 Million Records

2006-06-19 Thread Micha Bieber
Monday, June 19, 2006, 07:37:22, Manzoor Ilahi Tamimy wrote:

 The Database Size is  more than 500 MB.
 It contain one table and about 10 million Records.

I had problems with even more records (roughly 25 million, ~1 GB of
data) and I stopped trying to do it in pure sqlite in the end, also
because datasets with even more data (10 GB) are foreseeable.
Anyway, the problem led to another solution. In _my case_ the bulky
data are relatively simple, and access to them required almost nothing of
what SQL has to offer. So hdf5 (http://hdf.ncsa.uiuc.edu/HDF5/) for the mass
data + sqlite for the more sophisticated (but much smaller) tables play
together pretty well. E.g., the hdf5 library was able to write a complete
1.2 GB file in 25 s - at that point file I/O becomes the bottleneck for
sqlite. But when analyzing your problem, keep in mind that hdf5 has other
limitations. Inserts and even appends are not easily achieved in hdf5,
and the same goes for some read operations. It's still great when you have
multidimensional data in the sense scientific communities use the term:
you can select so-called hyperslabs from these fields very, very
quickly.

Micha  
-- 



Re: [sqlite] create unique index quickly

2006-05-22 Thread Micha Bieber
Monday, May 22, 2006, 15:17:21, Jay Sprenkle wrote:

 On 5/22/06, Dennis Jenkins [EMAIL PROTECTED] wrote:
 Brannon King wrote:
  The benefits I'm trying to get out of sqlite are the data queries. I
  collect a large, sparse 2D array from hardware. The hardware device is
  giving me a few GB of data at 200MB/s. Future hardware versions
  will be four times that fast and give me terabytes of data. After I
  have the data, I then have to go through and make calculations on
  sub-boxes of that data. (I'll post some more about that in a different
  response.) I was trying to avoid coding my own
  sparse-matrix-file-stream-mess that I would have to do if I didn't
  have a nice DB engine. I think sqlite will work. I think it will be
  fast enough. I'll have some nice RAID controllers on the production
  machines with 48-256MB caches.

 Hello Brannon,

 I am simply curious.  This sounds like an amazing engineering
 challenge.  If it is not a secret, can you describe what this data
 represents and how it will be used?

 Me too!

Me too too :)

_Personally_ I think this sounds like a task that doesn't quite fit into
sqlite's (probably any 'standard' database's) realm. This is a bit
off-topic in this group, but since you mention sub-boxes - did you
ever look into more specialized file formats like HDF5:

http://hdf.ncsa.uiuc.edu/HDF5/

with support for certain (of course limited) queries?

Micha  
-- 



Re: [sqlite] Index syntax for attached database

2006-05-21 Thread Micha Bieber
Sunday, May 21, 2006, 02:10:31, Brannon King wrote:

 The documentation says to put the database name on the front of the 
 index name, not the table name when using the create index command. I 
 thought it was weird myself.

Thank you, that did the trick! I missed this in CREATE's
documentation, concentrating exclusively on ATTACH.
Besides that, I still find it a bit counter-intuitive too ...

Micha
-- 



Re: [sqlite] sqlite3 struct etc

2006-05-20 Thread Micha Bieber
 It is not a smart technique to assume that you can access underlying
 data structures and expect them to remain identically placed between 
 releases.

What do you think my reason was for asking whether these structures are
public and stable ... ;-)

Micha  
-- 



[sqlite] Index syntax for attached database

2006-05-20 Thread Micha Bieber
I'm receiving sql errors like the following

no such table: main.phs_matrices_1

when trying to create an index using the syntax:

CREATE INDEX IF NOT EXISTS phsm_1_idx ON phs_matrices_1 (a, b, c)

from my C++ program.

The error statement is true - phs_matrices_1 doesn't belong to main,
but some attached database.
Trying to change the statement by fully qualifying the attached table
( ... abc.phs_matrices_1 ...) causes a syntax error near '.'.
The strange thing is that the table definitely exists already, and inserting
data etc. works at this moment (also with shortened table names, as
stated in the documentation). So the database should be
properly attached.

This is sqlite 3.3.5

Does anyone have an idea what went wrong here?

Micha
-- 



[sqlite] sqlite3 struct etc

2006-05-18 Thread Micha Bieber
@list

Maybe I'm too C++ biased - but what is the status of the sqlite3 and
similar structures - e.g. 'Db' - in sqlite's C interface? Are they
considered 'public' and also stable, or does the missing documentation
(at least I've found nothing apart from the sources) indicate that they
shouldn't be used in user code other than as opaque pointers for the
sqlite interface?

Micha
-- 



Re: [sqlite] sqlite3 struct etc

2006-05-18 Thread Micha Bieber
Thursday, May 18, 2006, 17:36:53, Jay Sprenkle wrote:

 Since you pass that in to begin with, why do you need the database
 to provide information you already have?

That's by design (tm), but it might not be the best one. For performance
reasons I have split my project in a way that requires 2 database files.
The first one holds all sorts of tricky information. This database is
pretty small. The second one holds somewhat bulky amounts of data:
several million rows distributed over 200 tables.
I have to be careful with everything I do with this db, so even deleting
the poor thing is at times the best solution to get anything done in time.
Of course both databases are logically intertwined, and I have all sorts of
different requirements - opening/creating/updating the one but not the
other, etc., and attaching the big one to the small one at different times,
not only during creation.
For this it helps to store the file name of one of the databases as a
special table entry in the other. To do so, it would be nice to have
access to the file name at arbitrary times.
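One way to recover the file name at arbitrary times without caching it
yourself: `PRAGMA database_list` reports the file behind every open
database. Sketched here with Python's stdlib sqlite3 for brevity; the
same PRAGMA works from the C API via a prepared statement.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "bulk.db")
con = sqlite3.connect(":memory:")      # the small "main" database
con.execute("ATTACH DATABASE ? AS bulk", (path,))

# PRAGMA database_list returns (seq, name, file) for every open database,
# including attached ones; an in-memory database reports an empty file name.
files = {name: file for _seq, name, file in con.execute("PRAGMA database_list")}
```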

Micha  
-- 



[sqlite] Drop table performance issue

2006-05-15 Thread Micha Bieber
Hi list,

I'm running into a performance bottleneck in the following situation:

Database ~1.2 GB, sqlite 3.3.5, WinXP
~200 tables [phs_matrices_1 - phs_matrices_200] building the bulk of the
database. No AUTOVACUUM has been set.
Trying to drop these tables, the whole action requires ~2:45 min. The
following C++ code snippet (hopefully self-explanatory) has been used:

  db_.sql("begin");

  // Iterates over all the mentioned phs_matrices tables
  for (unsigned i=0; i!=records; ++i)
  {
    if (!db_.sql(QString("drop table phs_matrices_%1").arg(i)))
    {
      db_.sql("commit");
      return false;
    }
  }
  db_.sql("commit");

The table rows themselves have the following scheme:
[primary_key, INTEGER, INTEGER, INTEGER, INTEGER, REAL, REAL, REAL]

Is there a quicker way to get rid of a bunch of tables, or am I
overlooking something?
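The one obvious lever in the snippet above is already present: doing all
the DROPs inside a single explicit transaction, so there is only one
commit (and, on a real file, one fsync) for the whole batch. A minimal
sketch of that pattern in Python's stdlib sqlite3 - whether it closes the
gap on a 1.2 GB on-disk file is not tested here:

```python
import sqlite3

# isolation_level=None gives manual transaction control (autocommit mode)
con = sqlite3.connect(":memory:", isolation_level=None)
for i in range(200):
    con.execute(f"CREATE TABLE phs_matrices_{i} "
                "(k INTEGER PRIMARY KEY, a INTEGER, b INTEGER, c INTEGER, "
                "d INTEGER, p REAL, q REAL, r REAL)")

# one transaction around all 200 DROPs instead of one commit per DROP
con.execute("BEGIN")
for i in range(200):
    con.execute(f"DROP TABLE phs_matrices_{i}")
con.execute("COMMIT")

remaining = con.execute(
    "SELECT COUNT(*) FROM sqlite_master WHERE type='table'").fetchone()[0]
```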

Micha
-- 



Re: [sqlite] Possible bug with non-standard characters in column names

2006-05-11 Thread Micha Bieber
Thursday, May 11, 2006, 09:36:17, Preston  Chrystie wrote:

 if your first statement after creating the database is:
 PRAGMA encoding = UTF-16;

 then the error you get is slightly different:
 sqlite> ALTER TABLE test1 ADD straße VARCHAR(255);
 SQL error: malformed database schema - near "(": syntax error

 I was hoping that would fix it for you.. guess not, but at least you
 know of the potential error sooner rather than later...tried the
 column name in quotes too.. didn't help.

 --preston


 On 5/10/06, [EMAIL PROTECTED]
 [EMAIL PROTECTED] wrote:
 Not sure- I've tried this through JDBC and the command line
 client, I'm not a developer at that level :(

 Thanks,
 Andreas

 -Original Message-
Why not simply use 'Strasse'? I'm German too. IMO, the most
hassle-free way of doing these things is to avoid encoding issues if
you can. A database is often a 'hidden layer'; a good interface
should be able to transform the potentially visible parts into something
more polished (like 'Straße' for a UI).

Micha  
-- 



Re: [sqlite] planning to use sqlite for a project involving huge data

2006-05-03 Thread Micha Bieber
Wednesday, May 3, 2006, 06:42:34, E Tse wrote:

 I am planning to use sqlite for a project involving huge data. The planned
 sqlite file (which will contain a single table) will grow to more than 4g in
 size. Has anyone tried that before, and what problems have you encountered?

Be prepared that your planned design might become subject to change.
I have a 1 GB file here, initially populated mainly by one big table
containing 25 million records (3 floats, 5 integers). For *my* problem,
splitting the table into 200 smaller ones brought a big performance gain.
Everything matters in the case of big tables: file system issues
(as mentioned), but also index creation (when, if at all; for which
entities) and statement optimization (the requests themselves as well as
the handling of possibly huge answer sets from SELECT etc.).
Without some knowledge of your problem domain I'm afraid no
reasonable analysis is possible beyond very general remarks. Taking care of
the latter is not difficult, but the solution to your problem lies
most of the time in finding a suitable structure.
That said, my splitting was in no way arbitrary either; it followed from
certain conditions of the underlying problem.
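The splitting idea in miniature - one small table per entity, so a query
for one entity never scans the others. Sketched in Python's stdlib sqlite3
with 3 entities and 10 rows standing in for 200 and 25 million; table and
column names are illustrative:

```python
import sqlite3

con = sqlite3.connect(":memory:")
entities = range(1, 4)       # 200 tables in the real case
for e in entities:
    con.execute(f"CREATE TABLE phs_matrices_{e} "
                "(x INTEGER, y INTEGER, z INTEGER, p1 REAL, p2 REAL, p3 REAL)")
    con.executemany(f"INSERT INTO phs_matrices_{e} VALUES (?,?,?,?,?,?)",
                    [(x, 0, 0, 0.1, 0.2, 0.3) for x in range(10)])

# a query for one entity now touches only its own small table
count = con.execute("SELECT COUNT(*) FROM phs_matrices_2").fetchone()[0]
```

The trade-off is that queries spanning many entities need either a UNION
over tables or application-side iteration.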

Micha  
-- 



[sqlite] sqlite3_progress_handler, sqlite3_commit_hook

2006-04-30 Thread Micha Bieber
@list

Following the documentation, the two functions have been declared
'experimental'. Does anyone know how reliable these calls actually are?
Are there specific serious open issues, rough estimates of when they will
be fixed, etc.?
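For what it's worth, the progress-handler mechanism can be exercised like
this (Python's stdlib sqlite3 exposes it as set_progress_handler; this
shows the calling convention only, not a statement about the C API's
stability back then):

```python
import sqlite3

con = sqlite3.connect(":memory:")
calls = []
# the handler runs every 50 virtual-machine opcodes; returning a
# non-zero value would abort the statement in progress
con.set_progress_handler(lambda: calls.append(1) or 0, 50)
con.execute("CREATE TABLE t (x INTEGER)")
con.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(5000)])
con.set_progress_handler(None, 0)   # uninstall the handler
```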

Micha



Re: [sqlite] How to configure without 'configure' script?

2006-04-03 Thread Micha Bieber
 Hi,

 I would like to compile the SQLite 3 sources directly into my app.
 Since I'm using Visual C++ 6 how can I configure the sources to allow
 me to compile them?

 Can I use MSYS and run ./configure and then just copy the
 'configured' sources to my Visual C++ app and compile them there? Or
 is there some other way to configure the sources by simply defining
 some #define's somewhere?

Besides the fact that I create the relevant dsp/vcproj files with qmake
from Trolltech's Qt distribution, only one thing seems necessary to me:
remove shell.c and the tcl sources (I forget the actual file name)
from the distribution. The compiler spits out a lot of warnings (nothing
important - the 'conflict between signed and unsigned type' sort - but
maybe someone could fix this?), but will eventually create all object
files just fine.

Micha  
-- 



Re: [sqlite] Performance database design

2006-03-23 Thread Micha Bieber
Thursday, March 23, 2006, 03:44:11, Teg wrote:

 Hello Jay,

 Best way I've found to get great performance out of strings and
 vectors is to re-use the strings and vectors. String creation speed is
 completely dependent on allocation speed so, by re-using the strings,
 you only grow the ones that aren't already big enough to hold the new
 string data so, eventually they don't grow at all.

 At least with the STLPort STL which I use, a clear() doesn't de-allocate
 the space; re-use simply fills in the already allocated space.

I can second that in principle. In my case it was still only the first step
(albeit with a very notable effect). But in the end it was not as
fast as a) not using vectors at all and b) (even more impact) not
converting doubles, ints etc. into text. IMO the general advice is to avoid
any copies that aren't required (even if you have reference-counted copy
ctors or whatever optimization at user-class level) and unnecessary
conversions. Also, setting up vectors/matrices and presetting the size
(and increasing it later on) costs time and memory if you have the data
in another structure anyway. The systems here were mainly WinXP with Visual
Studio 2003 and its native STL, and a recent Redhat Linux on Opterons.

Micha  
-- 



Re: [sqlite] Performance database design

2006-03-22 Thread Micha Bieber
Eventually, I've got my lesson. Because it might be of some interest for
the beginner:

1) Use the matching sqlite3_bind_* variant for your data.
I made the mistake of converting back and forth to strings beforehand.

2) It broke my program design a bit, but setting up large STL-vector-based
C++ string records (especially using push_back and stringstreams
for conversions) turned out to be a show stopper on some platforms. The
plan was to feed them to sqlite3_bind_* later on (with the additional
performance drop caused by exclusively using sqlite3_bind_text, mentioned
above). Setting up the structures took more time than the asynchronous
writing to the database.

This way my 40 minutes per 25M rows dropped to 8 minutes, something I'm
completely happy with.

I'm sure many of these things are quite standard for the experts here, but
for a beginner they might be useful to know. I have seen a Wiki item
dealing with performance. Would it be possible to give the problem some
emphasis in the standard documentation too?
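Point 1) in miniature: letting the driver bind native numeric types
directly instead of round-tripping every value through text. Sketched with
Python's stdlib sqlite3, where executemany binds each int/float natively
(sqlite3_bind_int64/_double under the hood); table and column names are
illustrative:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE m (x INTEGER, p REAL)")

# bind native ints and floats directly - no str() / parsing round trip
data = [(i, i * 0.5) for i in range(1000)]
con.executemany("INSERT INTO m VALUES (?, ?)", data)

total = con.execute("SELECT SUM(x) FROM m").fetchone()[0]
```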

Thank you for the software, Richard (et al ?).
I like the library a lot.

Micha  
-- 



[sqlite] Performance database design

2006-03-21 Thread Micha Bieber
Hallo list,

I'm relatively inexperienced when it comes to databases and SQL (though
not to programming). I'm using sqlite's C API (a recent version), called
from a C++ application.

I'm confronted with the following situation:

Ca. 2500-3000 objects (called 'entities'), each with 3 properties (REAL
in sqlite notation) depending on 'positions' (x,y,z). (x,y,z) are
INTEGERs; the grid size might vary slightly between the entities.

The database in a first attempt looks like:

EntityTable

key  val1 val2 
1
2
.
.
2500

MatrixTable

key      key_to_entitytable  x  y  z  prop1  prop2  prop3
1        1                   0  0  0  .      .      .
.        1                   0  0  1  .      .      .
.        1                   0  0  2  .      .      .
...      ...                 .  .  .  .      .      .
25*10^6  ...                 .  .  .  .      .      .


My problem is the second table. Appending all 25*10^6 rows takes 40
minutes on a PC with 3 GHz, 1 GB memory and a sufficiently large hard disk.

1)
On a technical level, my usage of the sqlite API might be improper, so
I'll attach the relevant code encapsulating the appends:

The sql() function utilizes more or less sqlite3_exec + error handling,
but has no impact here.

table       ... name of the sqlite table
stmtmatrix  ... vector of rows ('BindRecord's - 25 million for the mentioned
case)

void Database::append(const QString& table, const vector<BindRecord>& stmtmatrix,
  bool skipprimarykey /*=true*/)
{
  if (stmtmatrix.empty())
    return;

  QString pstr("insert into ");
  pstr += table + " values (";
  if (skipprimarykey)
    pstr += "NULL, ";
  pstr += "?";
  for (unsigned i=1; i!=stmtmatrix[0].values.size(); ++i)
  {
    pstr += ",?";
  }
  pstr += ")";

  int c = 0;
  sql("begin");
  sqlite3_stmt *stmt;
  if (sqlite3_prepare(
        db_,
        pstr.ascii(),  // stmt
        -1,
        &stmt,
        0
      ) != SQLITE_OK)
  {
    printf("\nCould not prepare statement.");
    return;
  }
  for (unsigned i=0; i!=stmtmatrix.size(); ++i) // iterating rows
  {
    for (unsigned j=0; j!=stmtmatrix[i].values.size(); ++j)
    {
      int ERR = sqlite3_bind_text(
          stmt,
          j+1,  // index of wildcard
          stmtmatrix[i].values[j].ascii(),
          stmtmatrix[i].values[j].length(),  // length of text
          SQLITE_STATIC
          );

      if (ERR != SQLITE_OK)
      {
        printf("\nCould not bind parameter.");
      }
    }
    sqlite3_step(stmt);
    sqlite3_reset(stmt);
    ++c;
    if (c==10)
    {
      sql("commit");
      sql("begin");
      c=0;
    }
  }
  sql("commit");
  sqlite3_finalize(stmt);
}

Calling Database::sql("PRAGMA synchronous = OFF")
before Database::append improves speed, but not beyond the 40 minutes
cited. Can someone comment on the code regarding performance?

2)
Database design. Is this big table a good choice, or does anyone have
recommendations for a better design? For example, splitting the big table
into smaller ones (each assigned to a single entity) with ~10^4 rows and
later on building database requests using joins? Or something
completely different?

Thank you,
Micha
--