[sqlite] Bug: Successfully committed transaction rolled back after power failure

2016-02-01 Thread bm.emai...@gmail.com
Maybe, for Linux, there could be a PRAGMA to choose between
fsync-directory-on-commit,
fsync-directory-on-close, or no fsync-directory.

Particularly if it is a documented property (not an unintended bug) that the
Linux filesystem operates this way.

Best regards

Sent from my BlackBerry smartphone on O2

-----Original Message-----
From: Rowan Worth
Sender: sqlite-users-bounces at mailinglists.sqlite.org
Date: Tue, 2 Feb 2016 00:20:55
To: SQLite mailing list
Reply-To: SQLite mailing list
Subject: Re: [sqlite] Bug: Successfully committed transaction rolled back
 after power failure

Hi Stephen,

On 1 February 2016 at 21:45, Stephen Chrzanowski 
wrote:
>
> SQLite is nothing more than part of a program run by the OS.  It completely
> relies on whatever the OS tells it.  If the OS tells it that things are OK,
> then that is all that can be done.  SQLite can't take on the responsibility
> of making sure that my taxes are done right, it won't ever wash my car (I'd
> be afraid the T would scratch the paint anyways) and it absolutely cannot
> defend against the lies an OS gives it.


Of course! SQLite is a remarkably reliable bit of software and I hold it in
high regard, but as you say software has its limits. I'm not one to expect
more of SQLite than it can reasonably provide.

> Going back to the OS, I've seen instances, on both Windows and Linux based
> platforms (RH and Debian), where file sizes do not change even though streams
> of data are being pumped into that particular file.


Stands to reason on Linux if the directory hasn't been synced to disk yet.
The file's dirent is stored in a completely independent extent.


> I've seen situations
> where files have been deleted in Linux (Not allowed in Windows) and data
> still gets dumped into the now deleted file.


This is not an error, this is standard POSIX file semantics - if you have a
file descriptor referring to a file which gets deleted from disk, the file
descriptor is guaranteed to remain valid until it is closed. NFS even goes
to some lengths to make this work.
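Rowan's description of the POSIX semantics can be demonstrated with a few lines of Python's standard os module (an illustration of OS behaviour only; SQLite is not involved):

```python
import os
import tempfile

# Create a file, keep the descriptor, then remove its directory entry.
# POSIX guarantees the descriptor stays valid until it is closed.
fd, path = tempfile.mkstemp()
os.write(fd, b"before unlink\n")
os.unlink(path)                      # dirent gone; the inode lives on

os.write(fd, b"after unlink\n")      # writes still succeed
os.lseek(fd, 0, os.SEEK_SET)
data = os.read(fd, 100)
os.close(fd)                         # only now is the storage reclaimed
```

After the unlink, `data` still contains both writes even though the file no longer appears in any directory.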

> Is it the program's fault that
> the file is physically deleted but the program isn't aware?  Is it supposed
> to check every single time it writes a byte to see if the file exists, if
> the partition is healthy, and the root file system passes a FSCK at every
> commit?
>

:)


> SQLite doesn't know or care what sector the bytes
> live at on the drive physically.  It doesn't know about fragmentation, it
> doesn't know what size the clusters are, or anything that is important for
> file storage.


But SQLite *does* know how to interface with the file system. It's all very
well to say the OS/hardware sometimes lies, but in this specific instance
SQLite hasn't even asked the question.

As I indicated in the last paragraph of my mail, I'm not in favour of
fsync-directory-on-commit in the general case. But that's because I worry
about the performance impact, not because SQLite is doing its best and at
the mercy of the OS. I can't remember who in this thread suggested it but
fsync-directory-on-close seemed like a great compromise.

-Rowan
___
sqlite-users mailing list
sqlite-users at mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Pragma automatic_index and ROWID: which one is really responsible for automatic indexes?

2016-02-01 Thread Yannick Duchêne
On Thu, 28 Jan 2016 22:08:02 +0200
R Smith  wrote:

> I think you are misunderstanding the Pragma and the idea of automatic 
> indices. An automatic Index might be created on a table that doesn't 
> have an adequately assigned primary key. It might also be created during 
> a query (mostly SELECT or sub-SELECT) for which there is no useful Index 
> created by the table designer and the Query planner figures it will be 
> quicker to make an Index than to do table scans through the query. This 
> habit of creating indices during select queries can be forced to not 
> happen by setting the "PRAGMA automatic_index=0;", but this needs to 
> happen when you open the DB connection, or at a minimum, before you try 
> any query - not after the cursor is created,  by that time the index 
> might already be made. (This is why you are not seeing any speed 
> improvement).

Here is, from the documentation: https://www.sqlite.org/optoverview.html
At the very bottom, is said:
> Do not confuse automatic indexes with the internal indexes (having names like
> "sqlite_autoindex_table_N") that are sometimes created to implement a PRIMARY
> KEY constraint or UNIQUE constraint. The automatic indexes described here
> exist only for the duration of a single query, are never persisted to disk,
> and are only visible to a single database connection. Internal indexes are
> part of the implementation of PRIMARY KEY and UNIQUE constraints, are
> long-lasting and persisted to disk, and are visible to all database
> connections. The term "autoindex" appears in the names of internal indexes
> for legacy reasons and does not indicate that internal indexes and automatic
> indexes are related.

It's this legacy naming that causes the confusion.
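The behaviour R Smith describes is visible in EXPLAIN QUERY PLAN. A small sketch using Python's built-in sqlite3 module (the exact plan wording varies between SQLite versions, but the AUTOMATIC transient index only disappears when the pragma is issued before the query runs):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("CREATE TABLE a(x); CREATE TABLE b(x);")

def plan(con):
    # The last column of each EXPLAIN QUERY PLAN row is the detail text.
    rows = con.execute(
        "EXPLAIN QUERY PLAN SELECT * FROM a JOIN b ON a.x = b.x").fetchall()
    return " ".join(r[-1] for r in rows)

with_auto = plan(con)                    # planner may build a transient index
con.execute("PRAGMA automatic_index=0")  # must run before the query
without_auto = plan(con)
```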


-- 
Yannick Duchêne


[sqlite] Bug: Successfully committed transaction rolled back after power failure

2016-02-01 Thread Rowan Worth
On 1 February 2016 at 18:58, Simon Slavin  wrote:

>
> On 1 Feb 2016, at 9:23am, bm.email01 at gmail.com wrote:
>
> > ---
> > No, SQLite does not. On COMMIT it fsyncs the database file and unlinks
> the
> > journal[1], but does not fsync the directory.
> > ---
> >
> > Since that can cause the last transaction to be lost, despite Sqlite
> returning a 'Commit successful' code to the application, doesn't that mean
> that Sqlite is _not_ truly 'Durable' (per ACID)?
>
> 1) The fault here is the file system not SQLite.  This one particular file
> system has this strange requirement that you fsync the directory.  SQLite
> is ACID.  It's the file system that messes it up.  SQLite cannot defend
> against untrustworthy middleware -- sooner or later a programmer has to
> trust their hardware is doing what it says it's doing.
>

I take your point, but as Florian pointed out it's not just one file
system; it's a somewhat well-known quirk of POSIX fsync.
http://blog.httrack.com/blog/2013/11/15/everything-you-always-wanted-to-know-about-fsync/
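Persisting the unlink itself takes one extra fsync, on the parent directory; a sketch in Python's os module (SQLite would do the equivalent in C, through its VFS):

```python
import os

def unlink_durably(path):
    """Remove a file AND sync its parent directory, so the removal
    itself reaches disk before we report success."""
    path = os.path.abspath(path)
    os.unlink(path)
    dirfd = os.open(os.path.dirname(path), os.O_RDONLY)
    try:
        os.fsync(dirfd)              # persist the directory update
    finally:
        os.close(dirfd)
```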

It's a bit disingenuous to say "the file system requires a particular dance
to ensure an unlink persists on disk, but the dance is excessively complex
so we're going to skip it and blame any problems on the file system." The
fact remains that sqlite returned SQLITE_OK for a transaction that never
succeeded, because it didn't take the steps required for the i/o to hit
disk. However...


> 3) For both the above problems SQLite still neither loses data nor
> corrupts the database file.  If the program crashes the file is
> automatically closed.  When the SQLite API is used to reopen the file the
> unexpected close will be discovered and corrected for.
>

I agree with this 100%. SQLite absolutely provides consistency here, which
I would argue is the most important. In an actual power-loss scenario, what
is an application going to do between receiving SQLITE_OK and before losing
power that possibly matters? If the disk cache hasn't been flushed by the
time the power drops, then nothing else you've done in the meantime will
have hit disk either.

Note that the OP's situation is very unusual in that a controlled shutdown
is initiated after sqlite returns SQLITE_OK, but that shutdown *does not
sync the filesystem*! That is the only reason for the 30 second window. In
usual circumstances, my feeling is that the window between "power failing
in the middle of sqlite_exec" and "power failing after sqlite_exec has
returned but before the unlinked journal is synced to disk" is so small
that sqlite is within its rights to blur the two.

-Rowan


[sqlite] Bug: Successfully committed transaction rolled back after power failure

2016-02-01 Thread Stephen Chrzanowski
On Mon, Feb 1, 2016 at 11:20 AM, Rowan Worth  wrote:

>
> > I've seen situations
> > where files have been deleted in Linux (Not allowed in Windows) and data
> > still gets dumped into the now deleted file.
>
>
> This is not an error, this is standard POSIX file semantics - if you have a
> file descriptor referring to a file which gets deleted from disk, the file
> descriptor is guaranteed to remain valid until it is closed. NFS even goes
> to some lengths to make this work.
>
>
I didn't claim that this was an error, but that it's a possibility that
introduces confusion and data corruption.

For instance, if the first DB file gets deleted while the controlling
program still owns it, the OS still allows the first instance to do what it
needs to do with that file.  The only 'things' that know that file exists
are the OS and the single instance of that application.

Now say a second instance of the application starts up, recreates the
original file (new inode) and starts creating journal/WAL files.  All of a
sudden you have two programs with very real and very valid temp files that
could be accessed by both processes, potentially getting your data into a
really nasty state.  Is the application going to pass or fail ACID because
of external pressures?  Yanking the power cable is considered an external
pressure.

This has nothing to do with the thread, but just an illustration that your
software is at the mercy of the OS.  And if the OS isn't telling your
program the truth then it is the OS's fault, not the application or SQLite.


> > Is it the program's fault that
> > the file is physically deleted but the program isn't aware?  Is it
> supposed
> > to check every single time it writes a byte to see if the file exists, if
> > the partition is healthy, and the root file system passes a FSCK at every
> > commit?
> >
>
> :)
>
>

>
> > SQLite doesn't know or care what sector the bytes
> > live at on the drive physically.  It doesn't know about fragmentation, it
> > doesn't know what size the clusters are, or anything that is important
> for
> > file storage.
>
>
> But SQLite *does* know how to interface with the file system. It's all very
> well to say the OS/hardware sometimes lies, but in this specific instance
> SQLite hasn't even asked the question.
>
>
That is all it has.  An interface.  Good word.  The information exchanged
is cut down to the basics.  It doesn't know what the OS REALLY did with
that fsync, or if those magnetic fields are aligned properly on the
platter, or stored in those transistor cells properly.

I'm not sure that I understand where you're going with "hasn't even asked
the question"?  Are you referring to pre-hard-power-off or post?  What I've
understood this thread as being: "My application received a SQLITE_OK for a
commit, but, after a hard power loss by yanking the power cable, the reboot
comes back and my commit isn't there."  It's because the OS didn't write the
data when the software asked it to.  My point is: how far can the software
go to make sure the OS is doing its job?  Which is why I wrote that
paragraph above where you smiled. ;)


> As I indicated in the last paragraph of my mail, I'm not in favour of
> fsync-directory-on-commit in the general case. But that's because I worry
> about the performance impact, not because SQLite is doing its best and at
> the mercy of the OS. I can't remember who in this thread suggested it but
> fsync-directory-on-close seemed like a great compromise.
>
> -Rowan
>

*ONLY* if configurable, and *ONLY* if not a default.  Most of my apps are
set up at highest risk because the data they contain doesn't matter and I
want performance over safety.  There are applications I have written that
are data critical and for which I've got full sync on; my desk is
illuminated by the HDD light while I wait for that INSERT to finish.


[sqlite] Bug: Successfully committed transaction rolled back after power failure

2016-02-01 Thread Stephen Chrzanowski
This is steering off topic, but, here's my deal with computers as a whole.
Bluntly, I love them, I can't be without them, and I wouldn't change my
experiences with them for anything.  That doesn't mean that I implicitly
trust them.

My expectations are "low" because experience has taught me to keep them low
over the 30 years I've been dealing not only with software but hardware as
well, going back to the VIC-20.  I was 8 when we got that machine, and by 9
I was already tearing it apart to "see how it worked".  I don't think my
parents ever found out, though {smirks at the good ol' days}.  I've seen
pretty funky things with computers, especially when it comes to flaky
electricity supplies.  I only point that out because this thread originated
with the statement "Pulled the plug".

At my old job as a repair tech at a mom-and-pop computer shop for 6 years,
I had my hands on literally thousands of computers, and except for the
very few times when I was to build a few at the same time, they were all
unique.  I've had situations with a customer's machine that literally didn't
like the "taste" of electricity at the owner's house, but would work
flawlessly on my work bench.  I set that machine up, and one other I had
lying around, at the customer's house, and hers failed to work while in
the OS while mine started no problem.  A possible reason is power sags and
power spikes affecting the PSU.  I don't remember what I did with that
machine afterwards.  (I might still have the notes in my database.)  When
you yank that power cord, who knows which way that read/write head is
going to go, let alone what kind of condition the rest of your hardware is
going to be in.  You may think that yanking the power cord is the be-all
and end-all, but there are sparks of electricity that jump between the
connector plate and the plug in the PSU, and THAT is what causes the
unknowns.

Right now, at work, I have an external drive plugged in via ESATA so that I
can keep my own OS instead of sharing an OS with the other gents that use
that machine.  I'm looking at odd electrical problems with it right now,
probably because of the constant power switching over the past 3 years I've
been running it.  I have to let that drive spin up, warm up for a few
minutes, power it off, then back on, THEN turn the computer on.  I've
ALWAYS treated that drive and enclosure with the respect it needs as there
are no computer shops open 24/7.  I've never lost data, but, now I'm having
problems with it in a cold state.  So yes, again another reason to lower
that expectation bar.

I'm the kind of tech that doesn't use the PSU power switch to power off a
machine that is in a bad state unless I absolutely have to.  Soft power it
off with the power button for 4 seconds.  Only then would I use the PSU
power switch, or yank the cable if the supply didn't have a switch.

I'm pretty firm with my statement of "You pull that plug, all bets are off".

All software depends on hardware.  You yank that plug, you have an unknown
state until you bring it up.  Power surges and sags are evil to computer
hardware.

On Mon, Feb 1, 2016 at 10:14 AM, Howard Chu  wrote:

Stephen Chrzanowski wrote:
>
> Your expectations are pretty low. On a properly configured Unix host,
> there's no reason for a powerfail to prevent a successful reboot. E.g., if
> you mount boot and root filesystems as read-only filesystems, they can
> never get corrupted. If you're using modern filesystems for your writable
> partitions (e.g., FSs with journaling) then there's also no reason for them
> to fail to come back online.
>
> So it just comes down to your application code being reliable.
>
> I should note that SQLightning has none of the problems being described in
> this thread - in its default mode, it is full-ACID and a powerfail cannot
> lose data or corrupt the database. And it does all this while being at
> least 30% faster on writes than vanilla SQLite.
>
> Yes, the OS could have bugs. Yes, the hardware could physically fail.
> That's pretty rare though; HDDs R/W heads auto-retract on powerfail so
> unless the entire mechanism actually jammed, there's no way for a powerfail
> to cause a head crash or any other destructive event.
>
> Bottom line - if your OS reboots successfully, there's no excuse for your
> database to not also come up successfully, fully intact.
>
> --
>   -- Howard Chu
>   CTO, Symas Corp.   http://www.symas.com
>   Director, Highland Sun http://highlandsun.com/hyc/
>   Chief Architect, OpenLDAP  http://www.openldap.org/project/
>


[sqlite] Bug: Successfully committed transaction rolled back after power failure

2016-02-01 Thread Jean-Christophe Deschamps
At 17:55 01/02/2016, you wrote:
>The above about implementation of RAID is good. There were battery
>backed up caching controllers 20 years ago. In the event of a power
>loss, the cached writes could be completed later.

I run one such Areca RAID controller with 8 server disks in RAID6.  Even
after trying hard to make writes fail, I never had a single incident in
many years.  I'm sure many other users still run such setups today; no
need to go that far back in time.



[sqlite] Table entries disappearing for unknown reason.

2016-02-01 Thread Vinícius da Silva
Hello.

I'm developing a system for massive point cloud rendering and use SQLite to
store nodes of a spatial subdivision tree (an octree) of the space the
points lie in.  For this, I have a very simple table with just a key, which
is an integer identifier of the node (its Morton code), and a blob, which
is the node itself.  The system uses the table to store nodes whenever a
main memory budget is reached, in order to deal with point clouds of
arbitrary size.

My problem is that even though I can correctly insert the nodes into the
table (I have checked that by querying the database after insertion), it
seems that they are deleted moments later, for an unknown reason.  The
database requirements for the project are correctness and peak performance
in a multithreaded environment, without any care about rollbacks, since all
transactions are done by the system itself and for now the database is
meaningless after system exit.  To manage database connections I have the
SQLiteManager class, and the database setup is the following:

template< typename Point, typename MortonCode, typename OctreeNode >
void SQLiteManager< Point, MortonCode, OctreeNode >::init(
    const string& dbFileName, const bool deleteDbFileFlag )
{
    if( deleteDbFileFlag )
    {
        deleteFile( dbFileName );
    }

    checkReturnCode(
        sqlite3_open_v2( dbFileName.c_str(), &m_db,
            SQLITE_OPEN_READWRITE | SQLITE_OPEN_CREATE | SQLITE_OPEN_NOMUTEX,
            NULL ),
        SQLITE_OK
    );
    sqlite3_exec( m_db, "PRAGMA synchronous = OFF", NULL, NULL, NULL );
    sqlite3_exec( m_db, "PRAGMA journal_mode = OFF", NULL, NULL, NULL );
    sqlite3_busy_handler( m_db,
        []( void*, int )
        {
            // Always try again to make the database access
            return 1;
        },
        NULL
    );

    createTables();
    createStmts();
}

In addition to init(), there is an API to insert nodes and to create
transactions.  The serialization and the statement step, reset and return
code checking are already proven correct by automated tests.

template< typename Point, typename MortonCode, typename OctreeNode >
void SQLiteManager< Point, MortonCode, OctreeNode >::insertNode(
    const MortonCode& morton, const OctreeNode& node )
{
    byte* serialization;
    size_t blobSize = node.serialize( &serialization );

    checkReturnCode( sqlite3_bind_int64( m_nodeInsertion, 1,
        morton.getBits() ), SQLITE_OK );
    checkReturnCode( sqlite3_bind_blob( m_nodeInsertion, 2,
        serialization, blobSize, SQLITE_STATIC ), SQLITE_OK );

    safeStep( m_nodeInsertion );
    safeReset( m_nodeInsertion );

    delete[] serialization;
}

template< typename Point, typename MortonCode, typename OctreeNode >
void SQLiteManager< Point, MortonCode, OctreeNode >::beginTransaction()
{
    safeStep( m_beginTransaction );
    safeReset( m_beginTransaction );
}

template< typename Point, typename MortonCode, typename OctreeNode >
void SQLiteManager< Point, MortonCode, OctreeNode >::endTransaction()
{
    safeStep( m_endTransaction );
    safeReset( m_endTransaction );
}

The statements creation is done by this other method:

template< typename Point, typename MortonCode, typename OctreeNode >
void SQLiteManager< Point, MortonCode, OctreeNode >::createStmts()
{
    safePrepare( "INSERT OR REPLACE INTO Nodes VALUES ( ?, ? );",
                 &m_nodeInsertion );
    safePrepare( "SELECT Node FROM Nodes WHERE Morton = ?;",
                 &m_nodeQuery );
    safePrepare( "SELECT Node FROM Nodes WHERE Morton BETWEEN ? AND ?;",
                 &m_nodeIntervalQuery );
    safePrepare( "SELECT * FROM Nodes;", &m_allIdNodesQuery );
    safePrepare( "SELECT Morton, Node FROM Nodes WHERE Morton BETWEEN ? AND ?;",
                 &m_nodeIntervalIdQuery );
    safePrepare( "DELETE FROM Nodes WHERE Morton BETWEEN ? AND ?;",
                 &m_nodeIntervalDeletion );
    safePrepare( "BEGIN TRANSACTION", &m_beginTransaction );
    safePrepare( "END TRANSACTION", &m_endTransaction );
}

This API is used by the HierarchyCreator class, which has an array of
SQLiteManager objects, each one used by a thread in the creation process.

// BEGIN PARALLEL WORKLIST PROCESSING.
#pragma omp parallel for
for( int i = 0; i < dispatchedThreads; ++i )
{
    ...

    if( isReleasing )
    {
        m_dbs[ i ].beginTransaction();
    }

    ... (Child nodes are processed, creating their parent inner node as a result) ...

    releaseSiblings( inner.child(), omp_get_thread_num(), m_octreeDim );

    if( isReleasing )
    {
        m_dbs[ i ].endTransaction();
    }
}


[sqlite] Bug: Successfully committed transaction rolled back after power failure

2016-02-01 Thread Howard Chu
Stephen Chrzanowski wrote:
> @Rowan;
>
> First off, whether the OS or SQLite is ACID or not, if you pull the plug on
> your hardware, all bets are off on whether it'll BOOT, let alone recover a
> single transaction.  I get that this could be a useful tool when doing
> disaster proofing, but, at that stage in the game of bulletproofing, you
> can't win every battle, and you're running into that at 100 miles an hour.

Your expectations are pretty low. On a properly configured Unix host, there's 
no reason for a powerfail to prevent a successful reboot. E.g., if you mount 
boot and root filesystems as read-only filesystems, they can never get 
corrupted. If you're using modern filesystems for your writable partitions 
(e.g., FSs with journaling) then there's also no reason for them to fail to 
come back online.

So it just comes down to your application code being reliable.

I should note that SQLightning has none of the problems being described in 
this thread - in its default mode, it is full-ACID and a powerfail cannot lose 
data or corrupt the database. And it does all this while being at least 30% 
faster on writes than vanilla SQLite.

Yes, the OS could have bugs. Yes, the hardware could physically fail. That's 
pretty rare though; HDDs R/W heads auto-retract on powerfail so unless the 
entire mechanism actually jammed, there's no way for a powerfail to cause a 
head crash or any other destructive event.

Bottom line - if your OS reboots successfully, there's no excuse for your 
database to not also come up successfully, fully intact.

-- 
   -- Howard Chu
   CTO, Symas Corp.   http://www.symas.com
   Director, Highland Sun http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/


[sqlite] IS a SQLite db of small size as good as reliable cache?

2016-02-01 Thread Wade, William
Regarding cache eviction (below): are you referring to writing dirty pages to disk?

If you are talking about cached reads, the evidence seems to point to many such 
pages remaining in some kind of cache.

I perform a recursive grep (Cygwin) on a large directory tree.  It takes 4.5
minutes.  I wait five minutes and do it again; it takes 0.6 minutes.  It is
possible that I am measuring something else (caching at the disk-driver
level, or lower), but for the OP's question, that may be good enough.

Win7, 1 TB Seagate SCSI drive, 5500 rpm.  System has 32 GB of RAM (C: drive
is a 256 GB SSD).

Regards,
Bill

-Original Message-
From: Howard Chu [mailto:h...@symas.com]
Sent: Saturday, January 30, 2016 4:23 PM
To: SQLite mailing list
Subject: Re: [sqlite] IS a SQLite db of small size as good as reliable cache?

The Windows cache manager is quite aggressive at evicting cached pages from 
RAM. It used to be tunable back in Win2000, but none of those tuning knobs 
survived past WinXP. Generally, if you access some data, leave it for more than 
5-10 seconds, don't expect to be able to reference it again without incurring a 
hard page fault.

Note that the cache eviction runs quite frequently - once every 5 seconds or 
so, and evicts pages regardless of whether there's any memory pressure in the 
system. It's quite possibly the stupidest cache manager ever written.




[sqlite] Documentation comment: VFS

2016-02-01 Thread R Smith


On 2016/02/01 1:55 PM, Simon Slavin wrote:
> Nowhere on the page
>
> 
>
> does it say what "VFS" stands for.  Please don't tell me...

I'm going to ignore the request and tell you anyway: It clearly stands 
for "Variably F-Syncing Software".

And some people here would like to change the meaning to: "Very Feisty 
Synching"

Jokes aside, I agree with the notion that the VFS is what needs to
understand the file-system peculiarities.  The SQLite engine should remain
hardware-layer agnostic - though I'm not sure in this case whether it is
the VFS that can even instigate the folder sync or whether it needs to
originate from within the engine.  I hope the former.



[sqlite] Customizing the location of the .sqlite_history

2016-02-01 Thread Rowan Worth
sqlite3() {
confs='.sqlite_history .sqliterc'
for c in $confs; do ln -s ~/.config/$c ~/$c; done
(sleep 10; cd; rm $confs)&
command sqlite3 "$@"
}

Alternately, simply create a new user with home directory matching your
~/.config, and su to that user to run sqlite3.

(no, neither of these suggestions are serious :P)
-Rowan

On 1 February 2016 at 13:46, Jes Slow  wrote:

> Hi - this is kind of a trivial request, but .sqlite_history is one of the
> relatively few files that I can't move into ~/.config folder (or wherever
> it fits).
>
> We know that globals can become messy as you scale, and up until the XDG
> standard applications treated the home folder as their global place to drop
> configuration files. When you have dozens of programs, that starts to get
> messy.
>
> Many applications do this by allowing the user to set an environment
> variable to customize the location, altho personally I would prefer another
> way since environment variables are also global. Hope you consider it.
> Thanks
>


[sqlite] Bug: Successfully committed transaction rolled back after power failure

2016-02-01 Thread Adam Devita
At the risk of repeating something mentioned last week on this thread:
one tactic to reduce/avoid the no-directory-sync problem is to use WAL
mode.  A commit in WAL mode is a write to the WAL file, so the directory
sync problem goes away.
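A minimal sketch of this tactic with Python's built-in sqlite3 module (the thread concerns the C API, but the pragma is the same): once journal_mode=WAL is set, commits append to the -wal file and no journal is unlinked on COMMIT.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "wal_demo.db")
con = sqlite3.connect(path)

# Switch to write-ahead logging; the pragma reports the resulting mode.
mode = con.execute("PRAGMA journal_mode=WAL").fetchone()[0]

con.execute("CREATE TABLE t(x)")
con.execute("INSERT INTO t VALUES (1)")
con.commit()
# The commit went into wal_demo.db-wal; nothing was unlinked, so the
# unsynced-directory-entry problem on COMMIT does not arise.
```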

If you want to be in paranoid mode, don't trust others.  Why not use the
backup API before checkpointing and, after the checkpoint has succeeded,
check that both DBs have the same state before deleting (or archiving)
the backup?
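A sketch of that paranoid-mode idea using the backup API, via Python's built-in sqlite3 wrapper (Connection.backup; the C equivalent is sqlite3_backup_init/step/finish):

```python
import sqlite3

src = sqlite3.connect(":memory:")
src.execute("CREATE TABLE t(x)")
src.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(5)])
src.commit()

# Snapshot the live database, then verify both sides agree before
# deleting (or archiving) the copy.
dst = sqlite3.connect(":memory:")
src.backup(dst)
same = (src.execute("SELECT count(*) FROM t").fetchone()
        == dst.execute("SELECT count(*) FROM t").fetchone())
```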

Another tactic that hasn't been discussed (if memory serves me), and that
I'm curious whether it would work to get around the directory sync issue:
separate SQLite telling your program that the transaction is done from
the program telling the user.  Don't tell the *user* that the transaction
is done until you have confirmed that the .journal file that existed
before your commit no longer exists after it, so a power-off rollback
can't happen.  Could the OS lie and say the .journal file has been
deleted, only to have it re-appear if the power failure is at the 'wrong'
time?


The above about implementation of RAID is good. There were battery
backed up caching controllers 20 years ago. In the event of a power
loss, the cached writes could be completed later.

regards,
Adam




On Mon, Feb 1, 2016 at 10:14 AM, Howard Chu  wrote:
> Stephen Chrzanowski wrote:
>>
>> @Rowan;
>>
>> First off, whether the OS or SQLite is ACID or not, if you pull the plug
>> on
>> your hardware, all bets are off on whether it'll BOOT, let alone recover a
>> single transaction.  I get that this could be a useful tool when doing
>> disaster proofing, but, at that stage in the game of bulletproofing, you
>> can't win every battle, and you're running into that at 100 miles an hour.
>
>
> Your expectations are pretty low. On a properly configured Unix host,
> there's no reason for a powerfail to prevent a successful reboot. E.g., if
> you mount boot and root filesystems as read-only filesystems, they can never
> get corrupted. If you're using modern filesystems for your writable
> partitions (e.g., FSs with journaling) then there's also no reason for them
> to fail to come back online.
>
> So it just comes down to your application code being reliable.
>
> I should note that SQLightning has none of the problems being described in
> this thread - in its default mode, it is full-ACID and a powerfail cannot
> lose data or corrupt the database. And it does all this while being at least
> 30% faster on writes than vanilla SQLite.
>
> Yes, the OS could have bugs. Yes, the hardware could physically fail. That's
> pretty rare though; HDDs R/W heads auto-retract on powerfail so unless the
> entire mechanism actually jammed, there's no way for a powerfail to cause a
> head crash or any other destructive event.
>
> Bottom line - if your OS reboots successfully, there's no excuse for your
> database to not also come up successfully, fully intact.
>
> --
>   -- Howard Chu
>   CTO, Symas Corp.   http://www.symas.com
>   Director, Highland Sun http://highlandsun.com/hyc/
>   Chief Architect, OpenLDAP  http://www.openldap.org/project/



-- 
--
VerifEye Technologies Inc.
151 Whitehall Dr. Unit 2
Markham, ON
L3R 9T1


[sqlite] Documentation comment: VFS

2016-02-01 Thread Simon Slavin
Nowhere on the page



does it say what "VFS" stands for.  Please don't tell me.  I'm pointing out a 
strangeness in the documentation, not asking a question.

Simon.


[sqlite] Bug: Successfully committed transaction rolled back after power failure

2016-02-01 Thread Simon Slavin

On 1 Feb 2016, at 11:24am, Rowan Worth  wrote:
> 
> I take your point, but as Florian pointed out it's not just one file
> system; it's a somewhat well-known quirk of POSIX fsync.
> http://blog.httrack.com/blog/2013/11/15/everything-you-always-wanted-to-know-about-fsync/
> 
> It's a bit disingenuous to say "the file system requires a particular dance
> to ensure an unlink persists on disk, but the dance is excessively complex
> so we're going to skip it and blame any problems on the file system." The
> fact remains that sqlite returned SQLITE_OK for a transaction that never
> succeeded, because it didn't take the steps required for the i/o to hit
> disk.

Well, that's a fair point.  The problem is that SQLite does not know it's 
running on POSIX.  It does, however, have different VFSes for different 
platforms.  So the logical place to put this would be in the VFS.  At least 
that's what it looks like to me but I don't know much about the programming 
that goes into SQLite.

Simon.


[sqlite] How change the tmp vacuum temporary files

2016-02-01 Thread Andrea Peri
Hi,
thx for your response.

Unfortunately, choice no. 1 (the PRAGMA setting) is deprecated,
so it will no longer be available in new versions of sqlite.
I should not use it, to avoid losing compatibility with new versions of sqlite.

Also, in my scripts I have to assume that the environment variable
may not always be set, in which case the default path becomes /var/tmp
instead of /tmp.

A.



2016-02-01 10:46 GMT+01:00 Clemens Ladisch :
> Andrea Peri wrote:
>> After some searches, I found the trouble is due to the vacuum command
>> using the /var/tmp folder for temporary files.
>>
>> I would like to tell sqlite to use /tmp instead.
>> [...]
>> Unfortunately, I could not find any alternative in the documentation
>> to the deprecated temp_store_directory.
>
> This is OS dependent.
>
> In your case, SQLite's temporary directory is
> 1. whatever is set with the PRAGMA temp_store_directory command; or
> 2. whatever is set with the SQLITE_TMPDIR environment variable; or
> 3. whatever is set with the TMPDIR environment variable; or
> 4. /var/tmp; or
> 5. /usr/tmp; or
> 6. /tmp.
>
>
> Regards,
> Clemens



-- 
-
Andrea Peri
. . . . . . . . .
qwerty ?
-


[sqlite] Bug: Successfully committed transaction rolled back after power failure

2016-02-01 Thread Florian Weimer
On 01/25/2016 04:47 PM, Richard Hipp wrote:
> On 1/25/16, Matthias-Christian Ott  wrote:
>>
>> Does this mean that if I use SQLite SQLITE_EXTRA_DURABLE=0, PRAGMA
>> journal_mode=DELETE and PRAGMA synchronous=FULL, SQLite could lose a
>> transaction that it said was committed, depending on the VFS?
> 
> Sort of.  This appears to be true if you are running on QNX and you
> lose power (or do a hard reset) shortly after the transaction commits.
> It might be the case on other OSes/filesystems but it has never before
> been observed.

Both XFS and Ceph on Linux require the directory fsync operation to make
an unlink persistent.  (I did not run experiments, I asked people
familiar with these file systems.)

Please activate directory fsyncs for Linux, too.

Florian



[sqlite] How change the tmp vacuum temporary files

2016-02-01 Thread Simon Slavin

On 1 Feb 2016, at 10:46am, Andrea Peri  wrote:

> Also, in my scripts I have to assume that the environment variable
> may not always be set, in which case the default path becomes /var/tmp
> instead of /tmp.

See item number 3 in Clemens' list.  You should set the TMPDIR environment 
variable to wherever you think your temporary files should be.

Simon.


[sqlite] Bug: Successfully committed transaction rolled back after power failure

2016-02-01 Thread Simon Slavin

On 1 Feb 2016, at 9:23am, bm.email01 at gmail.com wrote:

> ---
> No, SQLite does not. On COMMIT it fsyncs the database file and unlinks the
> journal[1], but does not fsync the directory.
> ---
> 
> Since that can cause the last transaction to be lost, despite Sqlite 
> returning a 'Commit successful' code to the application, doesn't that mean 
> that Sqlite is _not_ truly 'Durable' (per ACID)?

1) The fault here is the file system not SQLite.  This one particular file 
system has this strange requirement that you fsync the directory.  SQLite is 
ACID.  It's the file system that messes it up.  SQLite cannot defend against 
untrustworthy middleware -- sooner or later a programmer has to trust their 
hardware is doing what it says it's doing.

2) Even if this were fixed, with an fsync of the directory included in the VFS, 
your storage system lies to the OS.  So you still wouldn't get ACID behaviour 
because your storage system isn't ACID.  SQLite cannot defend against 
untrustworthy storage -- sooner or later a programmer has to trust their 
hardware is doing what it says it's doing.

3) For both the above problems SQLite still neither loses data nor corrupts the 
database file.  If the program crashes the file is automatically closed.  When 
the SQLite API is used to reopen the file the unexpected close will be 
discovered and corrected for.

Simon.


[sqlite] How change the tmp vacuum temporary files

2016-02-01 Thread Clemens Ladisch
Andrea Peri wrote:
> After some searches, I found the trouble is due to the vacuum command
> using the /var/tmp folder for temporary files.
>
> I would like to tell sqlite to use /tmp instead.
> [...]
> Unfortunately, I could not find any alternative in the documentation
> to the deprecated temp_store_directory.

This is OS dependent.

In your case, SQLite's temporary directory is
1. whatever is set with the PRAGMA temp_store_directory command; or
2. whatever is set with the SQLITE_TMPDIR environment variable; or
3. whatever is set with the TMPDIR environment variable; or
4. /var/tmp; or
5. /usr/tmp; or
6. /tmp.
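Item 2 is easy to exercise from a script; a minimal sketch in Python (the directory and table names are arbitrary, and SQLITE_TMPDIR must be set before the temporary file is created):

```python
import os
import sqlite3
import tempfile

# SQLITE_TMPDIR is consulted before TMPDIR and the hard-coded
# fallbacks, so set it before any temp file gets created.
tmpdir = tempfile.mkdtemp()
os.environ["SQLITE_TMPDIR"] = tmpdir

con = sqlite3.connect(os.path.join(tmpdir, "test.db"))
con.execute("CREATE TABLE t(x)")
con.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(1000)])
con.commit()
con.execute("VACUUM")  # VACUUM rebuilds the db via a temporary file
con.close()
```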


Regards,
Clemens


[sqlite] Bug: Successfully committed transaction rolled back after power failure

2016-02-01 Thread Rowan Worth
On 31 January 2016 at 03:56, James K. Lowden 
wrote:

> Surely SQLite does both -- fsync on file and directory -- as part of a
> commit. That's not in doubt, is it?
>

No, SQLite does not. On COMMIT it fsyncs the database file and unlinks the
journal[1], but does not fsync the directory. This is demonstrated by the
logs in Stefan's initial post, and indeed the purpose of his patch was to
introduce an fsync on the directory after unlinking the journal.

[1] with the default PRAGMA journal_mode=delete, obviously
-Rowan
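For the curious, the "directory dance" under discussion looks roughly like this on a POSIX system (a sketch of the idea in Python, not SQLite's actual code; the helper name is made up):

```python
import os

def durable_unlink(path):
    """Unlink a file, then fsync its parent directory so the removal
    of the directory entry is itself flushed to disk (POSIX/Linux)."""
    os.unlink(path)
    # Opening a directory read-only is permitted; fsync on its fd
    # persists the metadata update that removed the dirent.
    dirfd = os.open(os.path.dirname(path) or ".", os.O_RDONLY)
    try:
        os.fsync(dirfd)
    finally:
        os.close(dirfd)
```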


[sqlite] Find out how many times does SQLite hit the disk?

2016-02-01 Thread Rowan Worth
On 31 January 2016 at 15:09, Yannick Duchêne 
wrote:

> If it's memory mapped, it's less an efficiency issue,
>

Hm, can you elaborate on this assertion? I don't think I agree.

Let's say sqlite wants to access a page in the DB/journal. In the case of
normal file access this is a call to pread/ReadFile, and in the
memory-mapped case a call to memcpy.

Now, the data in question may or may not already be in the OS's disk cache.
If it is, pread/ReadFile/memcpy proceeds without delay. If the data is not
in the cache, pread/ReadFile blocks until the i/o is complete. Similarly,
memcpy will encounter a page fault and the process will block until the OS
completes the i/o required to fill the page in memory. I'll grant there's
an extra syscall per i/o in the normal file access mode, but this cost is
_vanishingly_ small compared to the time required to load data from disk.

Have I misunderstood the mechanism behind memory-mapped file access?
-Rowan
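The two access paths being compared can be sketched in Python (page size and file contents here are arbitrary):

```python
import mmap
import os
import tempfile

PAGE = 4096

# A small stand-in for a database file.
fd, path = tempfile.mkstemp()
os.write(fd, b"x" * (4 * PAGE))

# Path 1: one explicit syscall per access.  pread blocks in the
# kernel until the requested range is available.
page_via_pread = os.pread(fd, PAGE, PAGE)  # read page 1

# Path 2: memory-mapped access.  The slice below may page-fault,
# in which case the process blocks inside the copy rather than
# inside a read syscall -- the disk wait itself is identical.
with mmap.mmap(fd, 0, access=mmap.ACCESS_READ) as m:
    page_via_mmap = bytes(m[PAGE:2 * PAGE])

os.close(fd)
os.remove(path)
```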


[sqlite] How change the tmp vacuum temporary files

2016-02-01 Thread Andrea Peri
Hi,

using a vacuum command, I get a disk full error.

After some searches, I found the trouble is due to the vacuum command
using the /var/tmp folder for temporary files.

I would like to tell sqlite to use /tmp instead.

To do this, I read the documentation and found the command

PRAGMA temp_store_directory

But this command is deprecated,
so I don't want to use it because it could become a problem in the near future.

Unfortunately, I could not find any alternative in the documentation
to the deprecated temp_store_directory.

Is this true, or am I missing something in the documentation?

Thanks,

Andrea Peri

-- 
-
Andrea Peri
. . . . . . . . .
qwerty ?
-


[sqlite] Bug: Successfully committed transaction rolled back after power failure

2016-02-01 Thread Scott Robison
On Mon, Feb 1, 2016 at 9:55 AM, Adam Devita  wrote:

> At the risk of repeating something mentioned last week on this thread.
> One tactic to reduce/avoid the no-directory-sync problem is to use WAL
> mode. A commit in WAL mode is a write to the WAL file, so the directory
> sync problem goes away.
>

Further, if durability is important to your application, there are truncate
and persist options in addition to delete that might be more effective for
a particular use case.

Again, as I suggested last night (somewhat tongue in cheek) we could have a
paranoid journal mode that does persist, truncate and delete in that order.
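These journal modes are easy to try out; a minimal sketch using Python's sqlite3 (the file name is arbitrary):

```python
import os
import sqlite3
import tempfile

db = os.path.join(tempfile.mkdtemp(), "wal.db")
con = sqlite3.connect(db)

# WAL: a commit is an append+fsync to the -wal file, so there is
# no journal unlink and no directory-sync question on commit.
mode = con.execute("PRAGMA journal_mode=WAL").fetchone()[0]

# With rollback journals, PERSIST or TRUNCATE likewise avoid the
# unlink: con.execute("PRAGMA journal_mode=PERSIST")
con.execute("CREATE TABLE t(x)")
con.commit()
con.close()
```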


> If you want to be in paranoid mode, don't trust others. Why not use
> the backup api before checkpointing and after the checkpoint has
> succeeded, check that both dbs have the same state before deleting (or
> archiving) the backup?
>

In this case you aren't necessarily ahead because if the information is
cached in RAM but not flushed to disk, you could (theoretically) get a 100%
match yet not have the bits of either file truly synced to disk.


> Another tactic to handle that hasn't been discussed (if memory serves
> me) that I'm curious if the following would work to get around the
> directory sync issue:
> Separate Sqlite  telling your program that the transaction is done
> from the program telling the user. Don't tell the *user* that the
> transaction is done until you have confirmed the .journal file that
> existed before your commit no longer exists after it, so a power off
> rollback can't happen. Could the OS lie and say the .journal file has
> been deleted only to have it re-appear if the power failure is at the
> 'wrong' time?
>

Isn't that the exact nature of the problem being discussed? Journal is
deleted, hard power loss occurs, and journal file still exists when the OS
next powers up and looks for the journal file?

-- 
Scott Robison


[sqlite] Bug: Successfully committed transaction rolled back after power failure

2016-02-01 Thread bm.emai...@gmail.com
---
No, SQLite does not. On COMMIT it fsyncs the database file and unlinks the
journal[1], but does not fsync the directory.
---

Since that can cause the last transaction to be lost, despite Sqlite returning 
a 'Commit successful' code to the application, doesn't that mean that Sqlite is 
_not_ truly 'Durable' (per ACID)?

Best regards



Sent from my BlackBerry smartphone on O2

-Original Message-
From: Rowan Worth 
Sender: sqlite-users-bounces at mailinglists.sqlite.org
Date: Mon, 1 Feb 2016 10:35:34 
To: SQLite mailing list
Reply-To: SQLite mailing list 
Subject: Re: [sqlite] Bug: Successfully committed transaction rolled back
 after power failure

On 31 January 2016 at 03:56, James K. Lowden 
wrote:

> Surely SQLite does both -- fsync on file and directory -- as part of a
> commit. That's not in doubt, is it?
>

No, SQLite does not. On COMMIT it fsyncs the database file and unlinks the
journal[1], but does not fsync the directory. This is demonstrated by the
logs in Stefan's initial post, and indeed the purpose of his patch was to
introduce an fsync on the directory after unlinking the journal.

[1] with the default PRAGMA journal_mode=delete, obviously
-Rowan


[sqlite] Wish List for 2016: High Level API for Object Oriented Interactive Languages

2016-02-01 Thread Gabor Grothendieck
On Mon, Feb 1, 2016 at 8:27 AM, Keith Medcalf  wrote:
>
>> > But you already have pandas.read_sql_query.  While that function
>> > isn't really what I'd call simple, the complexity afaict -- dates,
>> > floats, and chunks -- can be laid at Python's feet.
>>
>> I use R rather than python but the problem of dates is significant and
>> I assume the same problem as in python.  Consider this query:
>>
>>  select myDate + 1 from myTable
>>
>> Assume that myTable comes from an R data frame with a myDate column
>> that has R class of "Date".  Internally R stores the myDate column as
>> days since 1970-01-01 and since SQLite has no Date class it sends that
>> number to SQLite.  The select statement then adds 1 giving a number
>> representing the next day but when one gets it back to R how can R
>> know that that number is intended to represent a Date and so convert
>> it to R's Date class?  In fact it can't.   With databases that have a
>> Date type this is not a problem but it is a significant problem with
>> SQLite.
>
> What is the problem exactly?  Sounds like the R thingy is broken.  I do this 
> all the time using Python.  You simply have to compile the library with 
> column_metadata and then use it.  Sheesh.  Even pysqlite has done this just 
> peachy fine for about a decade.
>

> There are some hackish workarounds.  For example, consider this
>> self-contained reproducible R code:
>
>> library(sqldf)  # load package and dependencies
>> myTable <- data.frame(myDate = as.Date("2000-01-31"))  # define myTable
>> sqldf("select myDate + 1 myDate from myTable")
>
>> sqldf will use the heuristic of assuming that the myDate in the output
>> has the same class as the myDate in the input (because they have the
>> same name) and so will convert the output myDate column to the R
>> "Date" class but the ability of such a heuristic to work is fairly
>> limited.
>
>> It is also possible to specify to sqldf the class of each output
>> column but this is tedious and puts SQLite at a disadvantage since it
>> is unnecessary if used with a different backend database that is able
>> to return a Date class column.
>
>> With numeric and character columns there is not much problem but as
>> soon as one gets to date and date/time columns then this impedance
>> mismatch appears and is one of the main reasons an R user might decide
>> to use a different backend.
>
> There is no impedance mismatch.  Simply inadequate wattage by the person(s) 
> solving the problem.  As I said, this problem has been solved with SQLite and 
> Python for a decade.  So I would suggest the problem is that the wattage was 
> so low, the lights were completely out.
>

The impedance mismatch in the example is that on the sqlite side the type is
numeric and on the R side it is the Date class.  It is impossible to
know which sqlite numeric types are intended to represent dates and
which are intended to represent numbers so when they are sent back to
R there is no automatic assignment to Date class.  Either the
programmer has to specify it (which is tedious) or else one has to
rely on heuristics such as assuming that any output field having the
same name as input field must also have the same type (but such
heuristics do not cover all cases).

If you believe that python can know that a date is being returned from
sqlite without the programmer specifying it even though sqlite has no
date class, I suggest providing a self contained reproducible example
of python code to illustrate it as I did with R.
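For reference, the Python mechanism presumably being alluded to is sqlite3's declared-type detection. A sketch (my own example, not from the thread); note that it only helps for plain column selects, while an expression like myDate + 1 carries no declared type, which is exactly the limitation described above:

```python
import datetime
import sqlite3

# Explicit adapter/converter pair for columns declared as DATE.
sqlite3.register_adapter(datetime.date, lambda d: d.isoformat())
sqlite3.register_converter(
    "DATE", lambda b: datetime.date.fromisoformat(b.decode()))

con = sqlite3.connect(":memory:", detect_types=sqlite3.PARSE_DECLTYPES)
con.execute("CREATE TABLE myTable(myDate DATE)")
con.execute("INSERT INTO myTable VALUES (?)",
            (datetime.date(2000, 1, 31),))

# Plain column select: the declared type DATE triggers the converter.
plain = con.execute("SELECT myDate FROM myTable").fetchone()[0]

# Expression: no declared type survives, so SQLite's affinity rules
# apply to the stored ISO string and a bare number comes back.
expr = con.execute("SELECT myDate + 1 FROM myTable").fetchone()[0]
```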


[sqlite] Bug: Successfully committed transaction rolled back after power failure

2016-02-01 Thread Stephen Chrzanowski
@Rowan;

First off, whether the OS or SQLite is ACID or not, if you pull the plug on
your hardware, all bets are off on whether it'll BOOT, let alone recover a
single transaction.  I get that this could be a useful tool when doing
disaster proofing, but, at that stage in the game of bulletproofing, you
can't win every battle, and you're running into that at 100 miles an hour.

Next, SQLite *CAN* tell the OS that it is done with its data, that the OS
can close file handles, and that the OS can flush the cache to the disk.
Whether or not the OS, drivers, or disk drive obeys the command is an
entirely different story.

Now for my spiel;

SQLite is nothing more than part of a program run by the OS.  It completely
relies on whatever the OS tells it.  If the OS tells it that things are OK,
then that is all that can be done.  SQLite can't take on the responsibility
of making sure that my taxes are done right, it won't ever wash my car (I'd
be afraid the T would scratch the paint anyways) and it absolutely cannot
defend against the lies an OS gives it.  Believe me, computers are
sometimes worse liars than humans are when it comes to internal stability.
Look at SMART and the number of times it does error correction.  Were you
aware that digital TVs do error correction when getting data from your PVR,
Satellite receiver, or cable TV due to electrical noise?  Looks all good to
you while you're watching the tube, but it is constantly correcting errors
and lying that things are OK.

SQLite doesn't manage your OS, it doesn't manage the firmware running on
the drives.  It doesn't care whether it runs over SATA, IDE, SCSI, USB,
Firewire, Ethernet, or SAN.  It relies on the OS to get whatever data it
asks for, in either a read or write situation, and depends entirely on the
OS to get the job done.  SQLite does not manage the sector by sector reads
and writes, it doesn't handle the physical writing of the bits to the
platter.  The OS handles that via its drivers, and if the drivers lie, or
the OS isn't resilient enough for your needs, find a different OS or file
system.

The problem can also persist beyond the OS.  The hard drives own firmware
can lie to the OS.  Error correction happens after it tries to write the
data.  Some drives will do the write attempt once, fail but still send back
the OK, try again, and THEN send a fail back at the next operation that may
not even be related to your program.

Going back to the OS, I've seen instances, in both Windows and Linux based
platforms (RH and Debian), where file sizes do not change even though streams
of data are being pumped into that particular file.  I've seen situations
where files have been deleted in Linux (not allowed in Windows) and data
still gets dumped into the now-deleted file.  Is it the program's fault that
the file is physically deleted but the program isn't aware?  Is it supposed
to check every single time it writes a byte to see if the file exists, whether
the partition is healthy, and whether the root file system passes a FSCK at every
commit?

So yes, you HAVE to blame the file system and/or OS because SQLite isn't
managing a file system.  It manages a stream of data packaged into
particular lengths of raw data we call pages, tells the OS that it needs to
read a certain set of bytes from a specific number of bytes from the
beginning of a file.  SQLite doesn't know or care what sector the bytes
live at on the drive physically.  It doesn't know about fragmentation, it
doesn't know what size the clusters are, or anything that is important for
file storage.
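The deleted-but-still-writable behaviour mentioned above is standard POSIX and simple to demonstrate (a sketch; the names are arbitrary):

```python
import os
import tempfile

d = tempfile.mkdtemp()
path = os.path.join(d, "victim")
f = open(path, "wb")

os.unlink(path)             # the name is gone from the directory...
f.write(b"still landing")   # ...but the open descriptor keeps working
f.flush()

name_gone = not os.path.exists(path)
size = os.fstat(f.fileno()).st_size  # the data really went somewhere
f.close()                   # only now is the inode's storage released
```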

On Mon, Feb 1, 2016 at 6:53 AM, Simon Slavin  wrote:

>
> On 1 Feb 2016, at 11:24am, Rowan Worth  wrote:
> >
> > I take your point, but as Florian pointed out it's not just one file
> > system; it's a somewhat well-known quirk of POSIX fsync.
> >
> http://blog.httrack.com/blog/2013/11/15/everything-you-always-wanted-to-know-about-fsync/
> >
> > It's a bit disingenuous to say "the file system requires a particular
> dance
> > to ensure an unlink persists on disk, but the dance is excessively
> complex
> > so we're going to skip it and blame any problems on the file system." The
> > fact remains that sqlite returned SQLITE_OK for a transaction that never
> > succeeded, because it didn't take the steps required for the i/o to hit
> > disk.
>
> Well, that's a fair point.  The problem is that SQLite does not know it's
> running on POSIX.  It does, however, have different VFSes for different
> platforms.  So the logical place to put this would be in the VFS.  At least
> that's what it looks like to me but I don't know much about the programming
> that goes into SQLite.
>
> Simon.
>


[sqlite] Using custom function seems to not work on Universal App Platform v. 3.10.2.0 (x86, x64 builds only)

2016-02-01 Thread Artur Król
Hi,

I am seeing weird behaviour when using the Universal App Platform library v. 
3.10.2.0. The problem occurs only when the application is built for x86 or x64; 
the arm build works fine. 

Here is the case:
I have a wrapper from the pure SQLite dll to C# in my code (using DllImport 
attributes in C#).  I am trying to query a custom function registered via 
sqlite3_create_function_v2; registration works well in all builds. The function is 
also correctly invoked in all builds (verified with a breakpoint in debug mode), but 
stepping the SQLite statement to get the result works only for the arm build; the 
other builds crash with a .NET NullReferenceException.

I attached a zipped test project (Visual Studio 2015 with the UAP SDK is required); 
you only have to change the build platform (x86, x64, arm) to reproduce the case. I 
think it could be an SQLite bug in the x86 and x64 dlls.

Best regards,
Artur Król


[sqlite] IS a SQLite db of small size as good as reliable cache?

2016-02-01 Thread Keith Medcalf

> > Any insight into what they were thinking?  Back when I used Windows
> > daily, it used to annoy me that every morning the machine had to warm
> > up again, to revive the state I'd left it in the night before.  In
> > NetBSD I learned that unused memory is wasted memory, so why not use it?

The file caching algorithms in DOS and OS/2 were patented technology of IBM.  
Microsoft was not permitted to use them (that is why Microsoft *never* had any 
sort of disk cache that ever worked, even in DOS), plus Microsoft is a Low 
Wattage company.

OS/2 had IBM cache technology in it which worked properly.  However, when 
Microsoft broke the Joint Application Development Agreement with OS/2 New 
Technology 2.0 (which they renamed to just "NT") Microsoft had to remove all of 
IBMs code and write their own.  As I said previously, Microsoft is pretty low 
wattage when it comes to understanding how and why things work and how to make 
computers work efficiently.  After all, they do stupid shit like optimize for 
maximum CPU usage, and then put that in their documentation as a crowning 
achievement.







[sqlite] Wish List for 2016: High Level API for Object Oriented Interactive Languages

2016-02-01 Thread Keith Medcalf

> > But you already have pandas.read_sql_query.  While that function
> > isn't really what I'd call simple, the complexity afaict -- dates,
> > floats, and chunks -- can be laid at Python's feet.
> 
> I use R rather than python but the problem of dates is significant and
> I assume the same problem as in python.  Consider this query:
> 
>  select myDate + 1 from myTable
> 
> Assume that myTable comes from an R data frame with a myDate column
> that has R class of "Date".  Internally R stores the myDate column as
> days since 1970-01-01 and since SQLite has no Date class it sends that
> number to SQLite.  The select statement then adds 1 giving a number
> representing the next day but when one gets it back to R how can R
> know that that number is intended to represent a Date and so convert
> it to R's Date class?  In fact it can't.   With databases that have a
> Date type this is not a problem but it is a significant problem with
> SQLite.

What is the problem exactly?  Sounds like the R thingy is broken.  I do this 
all the time using Python.  You simply have to compile the library with 
column_metadata and then use it.  Sheesh.  Even pysqlite has done this just 
peachy fine for about a decade.

> There are some hackish workarounds.  For example, consider this
> self-contained reproducible R code:

> library(sqldf)  # load package and dependencies
> myTable <- data.frame(myDate = as.Date("2000-01-31"))  # define myTable
> sqldf("select myDate + 1 myDate from myTable")

> sqldf will use the heuristic of assuming that the myDate in the output
> has the same class as the myDate in the input (because they have the
> same name) and so will convert the output myDate column to the R
> "Date" class but the ability of such a heuristic to work is fairly
> limited.

> It is also possible to specify to sqldf the class of each output
> column but this is tedious and puts SQLite at a disadvantage since it
> is unnecessary if used with a different backend database that is able
> to return a Date class column.

> With numeric and character columns there is not much problem but as
> soon as one gets to date and date/time columns then this impedance
> mismatch appears and is one of the main reasons an R user might decide
> to use a different backend.

There is no impedance mismatch.  Simply inadequate wattage by the person(s) 
solving the problem.  As I said, this problem has been solved with SQLite and 
Python for a decade.  So I would suggest the problem is that the wattage was so 
low, the lights were completely out.






[sqlite] IS a SQLite db of small size as good as reliable cache?

2016-02-01 Thread Howard Chu
James K. Lowden wrote:
> On Sat, 30 Jan 2016 22:23:24 +
> Howard Chu  wrote:
>
>> Note that the cache eviction runs quite frequently - once every 5
>> seconds or so, and evicts pages regardless of whether there's any
>> memory pressure in the system. It's quite possibly the stupidest
>> cache manager ever written.
>
> Any insight into what they were thinking?  Back when I used Windows
> daily, it used to annoy me that every morning the machine had to warm
> up again, to revive the state I'd left it in the night before.  In
> NetBSD I learned that unused memory is wasted memory, so why not use it?

As I understand it, the main rationale is reliability - they don't trust their 
OS to stay up longer than 3 seconds after any particular write operation.

There's a lot more information here
https://groups.google.com/forum/#!topic/comp.os.ms-windows.nt.misc/449tdNYPoX0
That's from 2005 and unfortunately the Windows kernel variables for cache 
tuning no longer exist.

A lot of the relevant info no longer exists on the original websites either, 
but I was able to pull one up from the web archive:

https://web.archive.org/web/20010825042328/http://www.sysinternals.com/ntw2k/source/cacheman.shtml

> I have a feeling that "fast app launching" is the reason, as though
> Windows users were excitedly punching the Start button with a
> stopwatch.  But maybe there's more to it than that?

It may well be a user-oriented philosophy. It is certainly not a 
developer-oriented approach. It was my frustration with slow build times on 
Windows that led me to investigate this in the first place.

https://bugzilla.mozilla.org/show_bug.cgi?id=294122#c69

A lengthier explanation of how it works is online here 
https://msdn.microsoft.com/en-us/library/bb742613.aspx but there's pretty much 
no information there that's actionable - aside from LargeSystemCache there's 
no tuning knobs left.

-- 
   -- Howard Chu
   CTO, Symas Corp.   http://www.symas.com
   Director, Highland Sun http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/