Re: [sqlite] Request for ISO Week in strftime()

2017-05-17 Thread Olivier Mascia
In ISO, weeks indeed run from Monday (1) to Sunday (7), and all days between a 
Monday and the following Sunday belong to the same week.

The first week (week 1) of a year is the one containing the first Thursday of the 
year, or, said differently, the one containing the 4th of January.

This implies that week 1 can start in the previous year, and that some years 
have a week 53.
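Until strftime() offers a native '%V', the ISO week number can be derived from the existing date functions: shift to the Thursday of the date's week and take its day-of-year. A sketch using Python's built-in sqlite3 module (dates chosen to show the edge cases):

```python
import sqlite3

# Sketch: compute the ISO-8601 week number with SQLite date functions,
# since strftime() has no native '%V' substitution. The trick: find the
# Thursday of the given date's week, then derive the week number from
# that Thursday's day-of-year.
ISO_WEEK = "(strftime('%j', date(:d, '-3 days', 'weekday 4')) - 1) / 7 + 1"

def iso_week(conn, day):
    return conn.execute("SELECT " + ISO_WEEK, {"d": day}).fetchone()[0]

conn = sqlite3.connect(":memory:")
print(iso_week(conn, "2017-01-01"))  # a Sunday: belongs to week 52 of 2016
print(iso_week(conn, "2017-01-02"))  # a Monday: week 1 of 2017
print(iso_week(conn, "2015-12-31"))  # 2015 has 53 ISO weeks
```

The `'weekday 4'` modifier advances to the next Thursday (or stays put if the date already is one), so `date(d, '-3 days', 'weekday 4')` always lands on the Thursday of d's ISO week.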

-- 
Best regards, Meilleures salutations, Met vriendelijke groeten,  
Olivier Mascia (from mobile device), http://integral.software

> On 18 May 2017, at 07:31, John McMahon wrote:
> 
> Sorry, re-sending to list.
> 
> Point of Clarification: The ISO week begins with day 1 on Monday and ends with 
> day 7 on Sunday; hump day (colloq.) is Thursday. There may be other 
> repercussions in terms of week counts if this has not been implemented 
> correctly. I haven't checked; I do not use this personally.
> 
> John
> 
> 
> 
> 
>> On 17/05/2017 19:07, no...@null.net wrote:
>> The current '%W' week substitution appears to be US-specific. I would
>> like to make a feature request for a '%V' (or similar) substitution
>> that inserts the ISO-8601 week number.
> 
> -- 
> Regards
>   John McMahon
>  li...@jspect.fastmail.fm
> 
> 
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SQLite in memory

2017-05-17 Thread Paul
If by any chance you have access to Linux or the like, you can just mount a ramfs 
and move the database file over there.
It is an ordinary file system that lives in RAM, which guarantees that SQLite 
makes no disk accesses at all.


18 May 2017, 08:18:47, by "Gabriele Lanaro" :
 
>   Hi, I'm trying to assess if the performance of my application is dependent
> on disk access from sqlite.
> 
> To rule this out I wanted to make sure that the SQLite DB is completely
> accessed from memory and there are no disk accesses.
> 
> Is it possible to obtain this effect by using pragmas such as cache_size?
> 
> Another solution is to copy the existing db to a :memory: db but I'd like
> to achieve the same effect without doing so (because it will require
> substantial modification of the application). For the sake of argument,
> let's imagine that using a :memory: db is not an option.
> 
> Also using a ramdisk is not an option because I don't have root access to
> the machine.
> 
> Thanks,
> 
> Gabriele


Re: [sqlite] SQLite in memory

2017-05-17 Thread petern
From the SQLite shell (CLI), have you tried the dot commands ".backup" to a file
and ".restore" to a new :memory: DB?  That assumes a few things like access
to the filesystem and sufficient user memory quota to hold the disk version
of the DB.  Does that work?

The shell dot commands and their syntax are explained here:
https://sqlite.org/cli.html
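The same copy can also be done programmatically via the online backup API; a sketch using Python's built-in sqlite3 module (the file name is illustrative):

```python
import sqlite3

# Sketch: load an on-disk database into a :memory: database through the
# backup API -- the programmatic counterpart of the shell's
# ".backup"/".restore" dot commands. The path argument is illustrative.
def load_into_memory(path):
    disk = sqlite3.connect(path)
    mem = sqlite3.connect(":memory:")
    disk.backup(mem)      # copies every page of the disk DB into RAM
    disk.close()
    return mem            # all further access is RAM-only
```

As with the dot commands, this assumes the machine has enough free memory to hold the whole database.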



On Wed, May 17, 2017 at 10:18 PM, Gabriele Lanaro  wrote:

> Hi, I'm trying to assess if the performance of my application is dependent
> on disk access from sqlite.
>
> To rule this out I wanted to make sure that the SQLite DB is completely
> accessed from memory and there are no disk accesses.
>
> Is it possible to obtain this effect by using pragmas such as cache_size?
>
> Another solution is to copy the existing db to a :memory: db but I'd like
> to achieve the same effect without doing so (because it will require
> substantial modification of the application). For the sake of argument,
> let's imagine that using a :memory: db is not an option.
>
> Also using a ramdisk is not an option because I don't have root access to
> the machine.
>
> Thanks,
>
> Gabriele


Re: [sqlite] Request for ISO Week in strftime()

2017-05-17 Thread John McMahon

Sorry, re-sending to list.

Point of Clarification: The ISO week begins with day 1 on Monday and ends 
with day 7 on Sunday; hump day (colloq.) is Thursday. There may be other 
repercussions in terms of week counts if this has not been implemented 
correctly. I haven't checked; I do not use this personally.


John




On 17/05/2017 19:07, no...@null.net wrote:

The current '%W' week substitution appears to be US-specific. I would
like to make a feature request for a '%V' (or similar) substitution
that inserts the ISO-8601 week number.



--
Regards
   John McMahon
  li...@jspect.fastmail.fm




[sqlite] SQLite in memory

2017-05-17 Thread Gabriele Lanaro
Hi, I'm trying to assess if the performance of my application is dependent
on disk access from sqlite.

To rule this out I wanted to make sure that the SQLite DB is completely
accessed from memory and there are no disk accesses.

Is it possible to obtain this effect by using pragmas such as cache_size?

Another solution is to copy the existing db to a :memory: db but I'd like
to achieve the same effect without doing so (because it will require
substantial modification of the application). For the sake of argument,
let's imagine that using a :memory: db is not an option.

Also using a ramdisk is not an option because I don't have root access to
the machine.

Thanks,

Gabriele


Re: [sqlite] Bulk load strategy

2017-05-17 Thread Joseph L. Casale
> -Original Message-
> From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On
> Behalf Of Simon Slavin
> Sent: Wednesday, May 17, 2017 2:02 PM
> To: SQLite mailing list 
> Subject: Re: [sqlite] Bulk load strategy
 
> I forgot: once you’ve remade the indexes run ANALYZE.  That’s the only time
> you need to do it.  At that time you have typical data in the tables and 
> SQLite
> will be able to gather all the data it needs to figure out good strategies.

Thanks for all the assistance, guys; it's working within reasonable limits on the 
hardware given.

The unique constraint on a distinguished name is more about data integrity.
I am associating data against that value, and it doesn't make sense to have more
than one. So if an "add" comes along unexpectedly (instead of an "update"), the
best way to know something is awry is for everything to turn pear-shaped.

Kinda makes it hard for bugs to go unnoticed. :)

Much appreciated everyone,
jlc


Re: [sqlite] Bulk load strategy

2017-05-17 Thread Simon Slavin

On 17 May 2017, at 5:05pm, Simon Slavin  wrote:

> Fastest way to do bulk inserts would be to delete all the indexes which don’t 
> play any part in identifying duplicates, then do the inserting, then remake 
> the indexes.

I forgot: once you’ve remade the indexes run ANALYZE.  That’s the only time you 
need to do it.  At that time you have typical data in the tables and SQLite 
will be able to gather all the data it needs to figure out good strategies.

Simon.


Re: [sqlite] Bulk load strategy

2017-05-17 Thread David Raymond
The key point I had in mind for keeping that index was that it is perfect 
for speeding up the foreign key check / subquery for this part. I wasn't 
thinking at all in terms of unique enforcement.

INSERT OR IGNORE INTO AdAttribute
  (Type, Value, AdObjectId)
  VALUES
  (@Type, @Value, (SELECT Id FROM AdObject WHERE DistinguishedName = 
@DistinguishedName));

But yeah, keeping track of that in your enveloping program is an option.

-Original Message-
From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On 
Behalf Of Olaf Schmidt
Sent: Wednesday, May 17, 2017 3:40 PM
To: sqlite-users@mailinglists.sqlite.org
Subject: Re: [sqlite] Bulk load strategy

Am 17.05.2017 um 19:08 schrieb David Raymond:

> The unique index on DistinguishedName though is what gets used for that sub 
> query of the insert, so most definitely keep that one index for the whole 
> load. (The others can be left out until the end though)
> 

I once had a similar scenario, and solved it with good speed -
by following Simon's suggestion to drop all indexes first -
and then I've "manually ensured Uniqueness" over a DB-
independent, normal HashList for fast "Exists-lookups".

After the import went through (with about 30 records/sec),
I've freed the HashList and recreated the Indexes on the DB.

Not sure though, how many unique "DistinguishedNames" Joseph
has to manage in his scenario (risking out-of-memory on the
Hash-Container).

Olaf



Re: [sqlite] Bulk load strategy

2017-05-17 Thread Olaf Schmidt

Am 17.05.2017 um 19:08 schrieb David Raymond:


The unique index on DistinguishedName though is what gets used for that sub 
query of the insert, so most definitely keep that one index for the whole load. 
(The others can be left out until the end though)



I once had a similar scenario, and solved it with good speed -
by following Simon's suggestion to drop all indexes first -
and then I've "manually ensured Uniqueness" over a DB-
independent, normal HashList for fast "Exists-lookups".

After the import went through (with about 30 records/sec),
I've freed the HashList and recreated the Indexes on the DB.

Not sure though, how many unique "DistinguishedNames" Joseph
has to manage in his scenario (risking out-of-memory on the
Hash-Container).
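The scheme described above can be sketched in Python (table and column names follow the schema discussed in this thread; the in-memory set plays the role of the HashList):

```python
import sqlite3

# Sketch of the "manually ensured uniqueness" idea: with all indexes
# dropped, an in-memory set stands in for the UNIQUE index during the
# bulk load. Table and column names are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE AdObject (Id INTEGER PRIMARY KEY, DistinguishedName TEXT)")

seen = set()                      # fast Exists-lookup, no index needed

def insert_unique(dn):
    key = dn.lower()              # mimic COLLATE NOCASE
    if key in seen:
        return False              # duplicate: skip (or flag an error)
    seen.add(key)
    conn.execute("INSERT INTO AdObject (DistinguishedName) VALUES (?)", (dn,))
    return True

insert_unique("CN=alice,DC=example,DC=com")
insert_unique("cn=ALICE,dc=example,dc=com")   # rejected as a duplicate
conn.commit()

# After the load: recreate the UNIQUE index as usual.
conn.execute("CREATE UNIQUE INDEX AdObject_idx_0 ON AdObject "
             "(DistinguishedName COLLATE NOCASE)")
```

As Olaf notes, the trade-off is that the set must fit in memory alongside the load.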

Olaf



Re: [sqlite] Bulk load strategy

2017-05-17 Thread Simon Slavin

On 17 May 2017, at 7:07pm, Joseph L. Casale  wrote:

> So I have one query which, if I encounter it, I expect will be painful:
> 
> UPDATE AdAttribute
> SET Value = @NewValue
>   WHERE Type = @Type
> AND Value = @Value;
> 
> I may pass member or memberOf to @Type; without the indexes this will be 
> abysmal.
> I don't expect to see this often and I don't have data that requires it in my 
> large data set.

In this case your two separate indexes, one on Type and one on Value, will not 
provide much help.  The ideal index for this case is the combined one on 
(Type, Value).  Creating this instead of the two you listed will dramatically 
speed up the UPDATE command, and reduce both the INSERT time and the size of the 
database.
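One quick way to confirm that a combined index is actually picked up is EXPLAIN QUERY PLAN; a sketch using Python's built-in sqlite3, with the schema trimmed to the relevant columns:

```python
import sqlite3

# Sketch: verify with EXPLAIN QUERY PLAN that a combined (Type, Value)
# index serves the UPDATE's WHERE clause. Schema is trimmed to the
# columns that matter here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE AdAttribute (Id INTEGER PRIMARY KEY, Type TEXT, Value TEXT)")
conn.execute("CREATE INDEX AdAttribute_idx_tv ON AdAttribute (Type, Value)")

plan = conn.execute(
    "EXPLAIN QUERY PLAN "
    "UPDATE AdAttribute SET Value = ? WHERE Type = ? AND Value = ?",
    ("new", "member", "old"),
).fetchall()
# The last column of each plan row is the human-readable detail; it
# names the index the UPDATE uses.
print(plan[0][-1])
```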

In terms of the other things you mentioned, I see nothing obvious you’ve missed 
and I think you have enough information to proceed.

Simon.



Re: [sqlite] Bulk load strategy

2017-05-17 Thread Joseph L. Casale
> -Original Message-
> From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On
> Behalf Of Simon Slavin
> Sent: Wednesday, May 17, 2017 10:05 AM
> To: SQLite mailing list 
> Subject: Re: [sqlite] Bulk load strategy

> I’m questioning the point of AdAttribute_idx_0 and AdAttribute_idx_1.  It’s
> rare to usefully index values without types, for instance.  Do you actually
> have a SELECT which uses that one ?  Wouldn’t it be more efficient to do
> 
>   CREATE INDEX AdAttribute_idx_tv ON AdAttribute (
>  Type, Value
>   );
> 
> ?  That’s assuming that even that one gets used at all, since it seems far 
> more
> likely that you’d use (AdObjectId,Type).

There were some instances where I need to search all values regardless of the 
type.
However, I see that approach could make changes I don't intend.

The use case was a modDN; however, you may be right, and I should constrain that
to types of "member" and "memberOf". Otherwise I could modify a free-form text
field over which I have no authority.

So I have one query which, if I encounter it, I expect will be painful:

UPDATE AdAttribute
 SET Value = @NewValue
   WHERE Type = @Type
 AND Value = @Value;

I may pass member or memberOf to @Type; without the indexes this will be 
abysmal.
I don't expect to see this often and I don't have data that requires it in my 
large data set.

However good catch.

> The sub-select is killing you.  Since it’s identical for all the INSERT 
> commands I
> suggest that you do that first, and keep the results in memory as a lookup
> table, or a hashed table, or a dictionary, or whatever your preferred language
> does.  You can look up those values in RAM far more quickly than SQLite can
> do the required file handling.

It seems I provided some bad numbers: I passed -w instead of -l to `wc` when
producing figures, so I actually have ~160k records. The application processed at
roughly constant speed and finished quickly.

Brilliant Simon and thank you everyone for the guidance.

jlc


Re: [sqlite] SAVEPOINT with multiple databases

2017-05-17 Thread Roman Fleysher
Thank you, David. Now it totally makes sense to me. I realize this is an SQL 
question, not a SQLite one.

Thank you,

Roman


From: sqlite-users [sqlite-users-boun...@mailinglists.sqlite.org] on behalf of 
David Raymond [david.raym...@tomtom.com]
Sent: Wednesday, May 17, 2017 12:38 PM
To: SQLite mailing list
Subject: Re: [sqlite] SAVEPOINT with multiple databases

One other minor thing of note is that attaching and detaching can only happen 
outside of a transaction. So you can't add or remove attached databases in the 
middle of a transaction, and transaction commit/rollback, or savepoint 
release/rollback will never leave you with a different set of attached 
databases than before that statement.


-Original Message-
From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On 
Behalf Of Roman Fleysher
Sent: Wednesday, May 17, 2017 12:25 AM
To: SQLite mailing list
Subject: Re: [sqlite] SAVEPOINT with multiple databases

Oh, now I understand. Thank you, Simon.


Roman



 Original message 
From: Simon Slavin 
Date: 5/16/17 5:35 PM (GMT-05:00)
To: SQLite mailing list 
Subject: Re: [sqlite] SAVEPOINT with multiple databases


On 16 May 2017, at 10:09pm, Roman Fleysher  
wrote:

> I think I came to a point where I need to learn SAVEPOINTs.
>
> I am trying to understand documentation if creation and release of save 
> points covers all presently attached databases, that is those before save 
> point is created? Is attaching a database just a command that will sit on the 
> transaction stack as any other, INSERT/UPDATE/DELETE?

It doesn’t matter when a database is attached to a connection.  A savepoint is 
something which affects the current transaction for a connection, which is 
comprised of the INSERT/UPDATE/DELETE operations you mentioned.

One SQLite connection:
Has access to at least one database, maybe more.
Has none or one current transaction.

One transaction:
Has zero or more savepoints.

Simon.


Re: [sqlite] Bulk load strategy

2017-05-17 Thread David Raymond
The unique index on DistinguishedName though is what gets used for that sub 
query of the insert, so most definitely keep that one index for the whole load. 
(The others can be left out until the end though)

Otherwise, as was mentioned, journal_mode = off, synchronous = off, a large 
cache_size, and running it all in 1 transaction are gonna be the things that 
help the most.

You shouldn't need to do intermediate ANALYZE runs; the defaults should use the 
correct indexes.


-Original Message-
From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On 
Behalf Of Simon Slavin
Sent: Wednesday, May 17, 2017 12:05 PM
To: SQLite mailing list
Subject: Re: [sqlite] Bulk load strategy


So if you never insert duplicates on AdObject(DistinguishedName), DROP that 
index.  And definitely DROP all the others.  Then do your inserts.  Then reCREATE 
the indexes.



Re: [sqlite] SAVEPOINT with multiple databases

2017-05-17 Thread David Raymond
One other minor thing of note is that attaching and detaching can only happen 
outside of a transaction. So you can't add or remove attached databases in the 
middle of a transaction, and transaction commit/rollback, or savepoint 
release/rollback will never leave you with a different set of attached 
databases than before that statement.


-Original Message-
From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On 
Behalf Of Roman Fleysher
Sent: Wednesday, May 17, 2017 12:25 AM
To: SQLite mailing list
Subject: Re: [sqlite] SAVEPOINT with multiple databases

Oh, now I understand. Thank you, Simon.


Roman



 Original message 
From: Simon Slavin 
Date: 5/16/17 5:35 PM (GMT-05:00)
To: SQLite mailing list 
Subject: Re: [sqlite] SAVEPOINT with multiple databases


On 16 May 2017, at 10:09pm, Roman Fleysher  
wrote:

> I think I came to a point where I need to learn SAVEPOINTs.
>
> I am trying to understand documentation if creation and release of save 
> points covers all presently attached databases, that is those before save 
> point is created? Is attaching a database just a command that will sit on the 
> transaction stack as any other, INSERT/UPDATE/DELETE?

It doesn’t matter when a database is attached to a connection.  A savepoint is 
something which affects the current transaction for a connection, which is 
comprised of the INSERT/UPDATE/DELETE operations you mentioned.

One SQLite connection:
Has access to at least one database, maybe more.
Has none or one current transaction.

One transaction:
Has zero or more savepoints.

Simon.


Re: [sqlite] Bulk load strategy

2017-05-17 Thread Simon Slavin

On 17 May 2017, at 4:06pm, Joseph L. Casale  wrote:

> CREATE TABLE AdObject (
>IdINTEGER PRIMARY KEY NOT NULL,
>DistinguishedName TEXTNOT NULL COLLATE NOCASE,
>SamAccountNameTEXTCOLLATE NOCASE
> );
> CREATE UNIQUE INDEX AdObject_idx_0 ON AdObject (
>DistinguishedName
> );
> CREATE INDEX AdObject_idx_1 ON AdObject (
>SamAccountName
> );
> 
> CREATE TABLE AdAttribute (
>Id   INTEGER PRIMARY KEY NOT NULL,
>Type TEXTNOT NULL COLLATE NOCASE,
>ValueTEXTNOT NULL COLLATE NOCASE,
>AdObjectId INTEGER NOT NULL REFERENCES AdObject ON DELETE CASCADE ON 
> UPDATE CASCADE
> );
> CREATE INDEX AdAttribute_idx_0 ON AdAttribute (
>Type
> );
> CREATE INDEX AdAttribute_idx_1 ON AdAttribute (
>Value
> );
> CREATE INDEX AdAttribute_idx_2 ON AdAttribute (
>AdObjectId
> );
> 
> The bulk of the inserts look like:
> INSERT INTO AdObject
>  (DistinguishedName, SamAccountName)
>  VALUES
>  (@DistinguishedName, @SamAccountName);
> 
> INSERT OR IGNORE INTO AdAttribute
>  (Type, Value, AdObjectId)
>  VALUES
>  (@Type, @Value, @AdObjectId);

Fastest way to do bulk inserts would be to delete all the indexes which don’t 
play any part in identifying duplicates, then do the inserting, then remake the 
indexes.

So if you never insert duplicates on AdObject(DistinguishedName), DROP that 
index.  And definitely DROP all the others.  Then do your inserts.  Then reCREATE 
the indexes.

As for the inserts themselves, batching them up in transactions of 10,000 seems 
acceptably fast.  But depending on the amount of memory you have free, 100,000 
may be faster.  Or maybe even 2,000.  You’ll have to try it out.

I’m questioning the point of AdAttribute_idx_0 and AdAttribute_idx_1.  It’s 
rare to usefully index values without types, for instance.  Do you actually 
have a SELECT which uses that one ?  Wouldn’t it be more efficient to do

CREATE INDEX AdAttribute_idx_tv ON AdAttribute (
   Type, Value
);

?  That’s assuming that even that one gets used at all, since it seems far more 
likely that you’d use (AdObjectId,Type).

> (I just noticed the IGNORE in the second query, which serves no purpose.)

That won’t slow down SQLite much.  Don’t worry about it.

> Things grind to a halt when I start the following:
> 
> INSERT OR IGNORE INTO AdAttribute
>  (Type, Value, AdObjectId)
>  VALUES
>  (@Type, @Value, (SELECT Id FROM AdObject WHERE DistinguishedName = 
> @DistinguishedName));

The sub-select is killing you.  Since it’s identical for all the INSERT 
commands I suggest that you do that first, and keep the results in memory as a 
lookup table, or a hashed table, or a dictionary, or whatever your preferred 
language does.  You can look up those values in RAM far more quickly than 
SQLite can do the required file handling.
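That suggestion, sketched in Python (the loader shape and the case-folding of names are illustrative assumptions):

```python
import sqlite3

# Sketch: replace the per-row sub-select with one upfront query whose
# results live in a dict, so each AdObjectId lookup is a RAM hash probe
# instead of a B-tree search per insert. Loader details are illustrative.
def bulk_insert_attributes(conn, rows):
    # rows: iterable of (Type, Value, DistinguishedName) tuples
    ids = {dn.lower(): oid for oid, dn in
           conn.execute("SELECT Id, DistinguishedName FROM AdObject")}
    conn.executemany(
        "INSERT OR IGNORE INTO AdAttribute (Type, Value, AdObjectId) "
        "VALUES (?, ?, ?)",
        ((t, v, ids[dn.lower()])
         for t, v, dn in rows
         if dn.lower() in ids),   # skip rows whose object is unknown
    )
```

The `.lower()` calls approximate the COLLATE NOCASE comparison that the original sub-select would have done.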

Simon.


Re: [sqlite] Bulk load strategy

2017-05-17 Thread Joseph L. Casale
> -Original Message-
> From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On
> Behalf Of Gerry Snyder
> Sent: Wednesday, May 17, 2017 9:14 AM
> To: SQLite mailing list 
> Subject: Re: [sqlite] Bulk load strategy
> 
> If the updates pertain just to the 150k rows immediately preceding them,
> could you put each 150k chunk into its own table, and then do a join when
> accessing the data? Or even a merge at that point? Could be a lot faster.

Hi Gerry,
The updates would refer to past entries; however, I have no idea when and
how often they appear. The complicating factor is that future records in the
source data may reflect past changes introduced, so I cannot defer them.

I certainly can alter the strategy; I am just not clear on exactly what you 
suggest.

Thanks!
jlc


Re: [sqlite] Bulk load strategy

2017-05-17 Thread Gerry Snyder
On Wed, May 17, 2017 at 3:52 AM, Joseph L. Casale  wrote:

> I am trying to bulk load about a million records each with ~20 related
> records
> into two tables. I am using WAL journal mode, synchronous is off and
> temp_store
> is memory. The source data is static and the database will only be used as
> a means
> to generate reporting and is not vital. I am deferring index creation to
> after the load.
> The load proceeds along quickly to about 150k records where I encounter
> statements
> which perform modifications to previous entries. The incoming data is
> structured
> this way and has relational dependencies so these modifications spread
> throughout
> affect subsequent inserts.
>
> In a scenario such as this, what is the recommended approach?
>
> Thanks,
> jlc
>


> If the updates pertain just to the 150k rows immediately preceding them,
> could you put each 150k chunk into its own table, and then do a join when
> accessing the data? Or even a merge at that point? Could be a lot faster.
>
> Gerry Snyder


Re: [sqlite] Bulk load strategy

2017-05-17 Thread Joseph L. Casale
> From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On
> Behalf Of Richard Hipp
> Sent: Wednesday, May 17, 2017 8:54 AM
> To: SQLite mailing list 
> Subject: Re: [sqlite] Bulk load strategy
> 
Can you send more details about your data and the updates and indexes
you are using?

Sure, the database will be used to generate a myriad of custom reports based on
Active Directory data of specific types. Some of the reports are not simple,
in that they involve cross-referencing attributes of one object, such as 
sIDHistory, with attributes of another, such as objectSid.

CREATE TABLE AdObject (
IdINTEGER PRIMARY KEY NOT NULL,
DistinguishedName TEXTNOT NULL COLLATE NOCASE,
SamAccountNameTEXTCOLLATE NOCASE
);
CREATE UNIQUE INDEX AdObject_idx_0 ON AdObject (
DistinguishedName
);
CREATE INDEX AdObject_idx_1 ON AdObject (
SamAccountName
);

CREATE TABLE AdAttribute (
Id   INTEGER PRIMARY KEY NOT NULL,
Type TEXTNOT NULL COLLATE NOCASE,
ValueTEXTNOT NULL COLLATE NOCASE,
AdObjectId INTEGER NOT NULL REFERENCES AdObject ON DELETE CASCADE ON UPDATE 
CASCADE
);
CREATE INDEX AdAttribute_idx_0 ON AdAttribute (
Type
);
CREATE INDEX AdAttribute_idx_1 ON AdAttribute (
Value
);
CREATE INDEX AdAttribute_idx_2 ON AdAttribute (
AdObjectId
);

The bulk of the inserts look like:
INSERT INTO AdObject
  (DistinguishedName, SamAccountName)
  VALUES
  (@DistinguishedName, @SamAccountName);

INSERT OR IGNORE INTO AdAttribute
  (Type, Value, AdObjectId)
  VALUES
  (@Type, @Value, @AdObjectId);

(I just noticed the IGNORE in the second query, which serves no purpose.)

Things grind to a halt when I start the following:

INSERT OR IGNORE INTO AdAttribute
  (Type, Value, AdObjectId)
  VALUES
  (@Type, @Value, (SELECT Id FROM AdObject WHERE DistinguishedName = 
@DistinguishedName));

The IGNORE above is required as the input data may ask to modify attributes for 
which no record exists.

Thanks for the help,
jlc


Re: [sqlite] Bulk load strategy

2017-05-17 Thread Richard Hipp
Can you send more details about your data and the updates and indexes
you are using?

On 5/17/17, Joseph L. Casale  wrote:
>> From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org]
>> On
>> Behalf Of Clemens Ladisch
>> Sent: Wednesday, May 17, 2017 6:04 AM
>> To: sqlite-users@mailinglists.sqlite.org
>> Subject: Re: [sqlite] Bulk load strategy
>>
>> Without an index, searching for a previous entry is likely to involve
>> a scan through the entire table.  It might be a better idea to have the
>> index available here, even with the additional cost of updating it.
>
> While that proved true, both approaches are still too slow. Maintaining the
> data in memory, to facilitate the potential manipulation before persisting
> it, would far exceed the workstation's memory capacity of 12 GB, so I need
> to come up with a new strategy.
>
> I tried adding ANALYZE statements periodically to update the index statistics;
> however, it seemed not to matter. I also tried committing transactions
> before the ANALYZE at the same interval, without any success.
>
> Anyone have any other suggestions?
>
> Thanks guys,
> jlc


-- 
D. Richard Hipp
d...@sqlite.org


Re: [sqlite] Bulk load strategy

2017-05-17 Thread Joseph L. Casale
> From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On
> Behalf Of Clemens Ladisch
> Sent: Wednesday, May 17, 2017 6:04 AM
> To: sqlite-users@mailinglists.sqlite.org
> Subject: Re: [sqlite] Bulk load strategy
> 
> Without an index, searching for a previous entry is likely to involve
> a scan through the entire table.  It might be a better idea to have the
> index available here, even with the additional cost of updating it.

While that proved true, both approaches are still too slow. Maintaining the
data in memory, to facilitate the potential manipulation before persisting
it, would far exceed the workstation's memory capacity of 12 GB, so I need
to come up with a new strategy.

I tried adding ANALYZE statements periodically to update the index statistics;
however, it seemed not to matter. I also tried committing transactions
before the ANALYZE at the same interval, without any success.

Anyone have any other suggestions?

Thanks guys,
jlc


Re: [sqlite] sqlite3_create_function xFinal argument called when there's an error?

2017-05-17 Thread Clemens Ladisch
Nelson, Erik - 2 wrote:
> should any result be set in xFinal?  Like sqlite3_result_error?  Or is the 
> previously-returned xStep error sufficient?

The sqlite3_result_xxx() documentation describes what happens when you
call it multiple times (i.e., the later call overrides the earlier
value).  Call sqlite3_result_error() in xFinal if you want that to
happen.
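For readers using Python's sqlite3 module rather than the C API, there is a rough analogue: an exception raised from finalize() is reported to the caller much as sqlite3_result_error() would be from xFinal. (This illustrates the behaviour only; it is not the C API itself.)

```python
import sqlite3

# Python-level analogue of setting an error in xFinal: in the sqlite3
# module, an exception raised from finalize() surfaces to the caller of
# the query, comparable to sqlite3_result_error() in the C xFinal.
class FailingSum:
    def __init__(self):
        self.total = 0

    def step(self, value):
        self.total += value

    def finalize(self):
        raise ValueError("aggregate failed")   # analogous to sqlite3_result_error

conn = sqlite3.connect(":memory:")
conn.create_aggregate("failsum", 1, FailingSum)
try:
    conn.execute("SELECT failsum(x) FROM (SELECT 1 AS x)").fetchone()
except sqlite3.OperationalError as e:
    print("query failed:", e)
```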


Regards,
Clemens


Re: [sqlite] sqlite3_create_function xFinal argument called when there's an error?

2017-05-17 Thread Nelson, Erik - 2
> From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On
> Behalf Of Clemens Ladisch
> Sent: Wednesday, May 17, 2017 2:36 AM
> To: sqlite-users@mailinglists.sqlite.org
> Subject: Re: [sqlite] sqlite3_create_function xFinal argument called when
> there's an error?
> 
> Nelson, Erik - 2 wrote:
> > for aggregate functions, is xFinal called if there's an error?
> 
> Yes; it's always called when SQLite cleans up the context.
> 
> 
> Clemens

Thanks!  In that case, should any result be set in xFinal?  Like 
sqlite3_result_error?  Or is the previously-returned xStep error sufficient?

--
This message, and any attachments, is for the intended recipient(s) only, may 
contain information that is privileged, confidential and/or proprietary and 
subject to important terms and conditions available at 
http://www.bankofamerica.com/emaildisclaimer.   If you are not the intended 
recipient, please delete this message.


Re: [sqlite] Function sqlite3_prepare_v2 of C API does not work

2017-05-17 Thread Олег Пруц
Thanks for your responses, the original reporters confirmed that it is
possible to insert a row in a table with parent columns when making them
UNIQUE (https://github.com/sqlitebrowser/sqlitebrowser/issues/463), so my
issue (https://github.com/sqlitebrowser/sqlitebrowser/issues/901) is
resolved too.

Regards,
Oleg Prutz


2016-12-22 4:38 GMT+03:00 Simon Slavin :

>
> On 21 Dec 2016, at 11:52pm, Олег Пруц  wrote:
>
> > I am contributing to DB Browser for SQLite (
> > https://github.com/sqlitebrowser/sqlitebrowser). It is written in C++
> and
> > it relies on C API.
> > We have a problem: when foreign_keys pragma is enabled, we cannot
> > use sqlite3_prepare_v2.
>
> I was about to suggest you try opening the same file in the SQLite
> command-line tool, but I see DRH has explained the problem.
>
> Simon.


Re: [sqlite] Bulk load strategy

2017-05-17 Thread Clemens Ladisch
Joseph L. Casale wrote:
> I am deferring index creation to after the load.
> The load proceeds along quickly to about 150k records where I encounter 
> statements
> which perform modifications to previous entries.

Without an index, searching for a previous entry is likely to involve
a scan through the entire table.  It might be a better idea to have the
index available here, even with the additional cost of updating it.


Regards,
Clemens


Re: [sqlite] Bulk load strategy

2017-05-17 Thread Richard Hipp
On 5/17/17, Joseph L. Casale  wrote:
> I am trying to bulk load about a million records each with ~20 related
> records
> into two tables. I am using WAL journal mode, synchronous is off and
> temp_store
> is memory. The source data is static and the database will only be used as a
> means
> to generate reporting and is not vital. I am deferring index creation to
> after the load.
> The load proceeds along quickly to about 150k records where I encounter
> statements
> which perform modifications to previous entries. The incoming data is
> structured
> this way and has relational dependencies so these modifications spread
> throughout
> affect subsequent inserts.
>
> In a scenario such as this, what is the recommended approach?

I think I would set "PRAGMA journal_mode=OFF;" since you will not be
using ROLLBACK.  Put the entire bulk load inside a single transaction,
and make the cache as big as you can, depending on the amount of RAM
you have on your system.
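A minimal sketch of that setup with Python's built-in sqlite3 (an in-memory database stands in for the real file, and the sizes and row count are illustrative):

```python
import sqlite3

# Sketch of the advice above: journal off, synchronous off, a large
# page cache, and the whole bulk load inside a single transaction.
# An in-memory DB stands in for a real file here; sizes are illustrative.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("PRAGMA journal_mode = OFF")      # no rollback journal at all
conn.execute("PRAGMA synchronous = OFF")
conn.execute("PRAGMA cache_size = -1000000")   # negative = size in KiB (~1 GB)
conn.execute("CREATE TABLE t (x)")

conn.execute("BEGIN")                          # one transaction for the whole load
conn.executemany("INSERT INTO t VALUES (?)",
                 ((i,) for i in range(100_000)))
conn.execute("COMMIT")
```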

-- 
D. Richard Hipp
d...@sqlite.org


[sqlite] Bulk load strategy

2017-05-17 Thread Joseph L. Casale
I am trying to bulk load about a million records each with ~20 related records
into two tables. I am using WAL journal mode, synchronous is off and temp_store
is memory. The source data is static and the database will only be used as a 
means
to generate reporting and is not vital. I am deferring index creation to after 
the load.
The load proceeds along quickly to about 150k records, where I encounter 
statements which perform modifications to previous entries. The incoming data is 
structured this way and has relational dependencies, so these modifications, 
spread throughout, affect subsequent inserts.

In a scenario such as this, what is the recommended approach?

Thanks,
jlc


[sqlite] Request for ISO Week in strftime()

2017-05-17 Thread nomad
The current '%W' week substitution appears to be US-specific. I would
like to make a feature request for a '%V' (or similar) substitution
that inserts the ISO-8601 week number.

-- 
Mark Lawrence