Thank you for answering Mr. Hipp. The general problem doesn't have to do with counting the rows (it was there as a placeholder). I just want to merge 2 (and more) table/streams.

The real query in madIS looks like the following:

create table t as
select upper(c1), c2, lower(c3) from
(
select * from file('http://www.foo.com/list1.tsv.gz')
union all
select * from file('http://www.foo.com/list2.tsv.gz')
)
where c2!=c4;

"list1.tsv.gz" and "list2.tsv.gz" are two enormous streams which i would like to process and put into table "t".

I have gone to great lengths to make the virtual table "file" being fully streamed. So both of the network files (list1, list2) arrive from the network packet by packet, get decompressed without touching the disk and then are broken into multiple columns ('tsv' is assumed to mean tab separated).

I admire SQLite very much for its predictability. So for the above query i would expect from it to scan first over the first file (list1) and then over the next (list2), and row by row put them in table "t".

This assumption was so strong that i've been searching all over the "file" VT code for a *very* long time to find out the bug in it that caused it to grid to a halt my system whenever i executed above query ( / partition was filled ).

I have a request. If "union all" cannot be changed to not write on the hard disk when scanning just once over tables/streams. Could the /var/tmp position that it writes to, be changed to another directory? I prefer to keep a small root (/) partition and right now i'm unable to do any "union all" on anything that is bigger than the free space on it.

Thank you again,

lefteris.

On 26/10/2012 9:23 μμ, Richard Hipp wrote:


On Fri, Oct 26, 2012 at 2:16 PM, Eleytherios Stamatogiannakis
<est...@gmail.com <mailto:est...@gmail.com>> wrote:

    I have been observing the following freaky behaviour of SQLite. When
    i run:

    select count(*) from (select * from huge_table union all select *
    from huge_table);

    Sqlite starts writting in /var/tmp/ a file like:

      /var/tmp/etilqs___gblRd6vUPcx91Hl, the root partition of fills up
    and an error is raised.

    Why does SQLite 3.7.14.1 need to write at all when doing union all?
    It seems to me that there is no reason for doing so.


The only way SQLite knows to evaluate the query is to (1) compute the
UNION ALL into a temporary table then (2) scan the temporary table to
count the rows.  /var/tmp space is used to hold the temporary table.

Try instead:

SELECT (select count(*) from huge_table)+(select count(*) from huge_table);




    Best regards,

    lefteris.
    _________________________________________________
    sqlite-users mailing list
    sqlite-users@sqlite.org <mailto:sqlite-users@sqlite.org>
    http://sqlite.org:8080/cgi-__bin/mailman/listinfo/sqlite-__users
    <http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users>




--
D. Richard Hipp
d...@sqlite.org <mailto:d...@sqlite.org>

_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to