On Thu, 28 Jan 2016 19:59:02 +0100
Dominique Devienne <ddevienne at gmail.com> wrote:

> Not sure to understand what you expect this pragma to do, but inserts
> typically don't involve automatic indexes, which are used only in queries
> (selects).

I though it was responsible for the `sqlite_autoindex_"table_name"_N` which 
gets generated and augments the DB size. That's the name and the documentation 
which makes me believe this. I can see these index in sqlitebrowser and at the 
file size.

> Your insert does have a query, but it fits the PK, so no automatic index is
> needed. (and you can likely use AUTOINCREMENT to avoid that query in the
> first place).

"count" is not a primary key, and AUTOINCREMENT may be used only on primary key.

It indeed does not need a primary key, a unique constraint is enough. Whether 
or not I define a PK or a unique constraint, it ends into the same speed and DB 
size. I just wanted to test if it differs when a PK is defined (it does not).


> The 2.4x size difference is kinda expected, since your PK includes your
> biggest column I suspect (i.e. value), so both the table, and its PK index
> most store it, doubling at least the memory, and with page waste, clearly
> can account for the difference.
> 
> And the time difference can be in-part at least attributed to the increased
> IO.

That was my conclusion too, and that's why I would like to prevent the creation 
of this extraneous index. I can achieve it using `WITHOUT ROWID`, and not with 
the pragma, and that's what I don't understand. So, I can achieve it, just that 
it's in a way I don't understand, which give me the feeling I'm using a trick 
and I don't like it. Unless ROWID is indeed what's responsible for the creation 
of these extraneous indexes??

> It seems unusual to use the HTML text in the PK. Are you sure you need it?

It's short texts (one to ten characters for an average), just element name, 
attribute name and attribute value, unique triplets with a count of their 
occurences. There are only a few attribute values which may be a bit long, the 
href of <a> elements. I must say I simplified my explanations, it's not really 
the index, it's a prior step which makes some statistics which are later used 
to decide what elements and attributes will be focused on to index the 
contents. The text the final index stores is reasonably short title-like texts.


-- 
Yannick Duch?ne

Reply via email to