On Fri, Oct 8, 2010 at 7:14 AM, Michele Pradella
<michele.prade...@selea.com> wrote:
>  "science fiction?" was a rhetorically question. I'm only wondering
> about what is the best and fastest way to DELETE a lot of records from
> huge DB. I know and understand physical limit of data moving: anyway for
> now I'm trying to split the BIG DELETE in some smaller DELETE to spread
> the time used. It's the only way I can figure out at the moment.
>
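
One way to spread that work out is to delete in bounded batches, each in
its own short transaction, so that no single transaction has to rewrite
hundreds of megabytes of pages. A minimal sketch, assuming a hypothetical
table log(id INTEGER PRIMARY KEY, ts INTEGER, ...) (the names and the
cutoff value are illustrative, not from your schema):

    -- Each run deletes at most 10,000 rows in one short transaction.
    BEGIN;
    DELETE FROM log
     WHERE id IN (SELECT id FROM log
                   WHERE ts < 1286496000   -- example cutoff, epoch seconds
                   LIMIT 10000);
    COMMIT;
    -- The application repeats this until sqlite3_changes() reports 0,
    -- optionally sleeping between batches to let other work through.

The subquery-plus-LIMIT form works on a stock build; putting LIMIT
directly on the DELETE requires SQLite compiled with
SQLITE_ENABLE_UPDATE_DELETE_LIMIT.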

Is a soft-delete faster? You could then run a slow-moving delete
(as mentioned earlier by Aldes Rossi, for example)
over the soft-deleted records.
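
A minimal sketch of the soft-delete idea, using the same hypothetical
log table plus a flag column (again illustrative, and assuming the flag
can be added up front):

    -- One-time schema change (SQLite supports ADD COLUMN):
    ALTER TABLE log ADD COLUMN deleted INTEGER NOT NULL DEFAULT 0;

    -- Cheap-ish up front: flip a flag instead of removing rows.
    UPDATE log SET deleted = 1 WHERE ts < 1286496000;

    -- Later, a slow-moving background job reclaims them in small bites:
    DELETE FROM log
     WHERE id IN (SELECT id FROM log WHERE deleted = 1 LIMIT 1000);

Every reader then has to filter on deleted = 0, and the bulk UPDATE still
touches many of the same pages the DELETE would, so whether this wins
depends on how much cheaper the flag write is than full row removal plus
index maintenance.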

Stephan

> On 08/10/2010 15.55, Jay A. Kreibich wrote:
>> On Fri, Oct 08, 2010 at 09:09:09AM +0200, Michele Pradella scratched on the 
>> wall:
>>>    I was thinking this too, but I'm keeping it as a last resort: my hope
>>> is that I can delete 5 million records in a few seconds. Science fiction? :)
>>    Science fiction of the worst B-grade sort.
>>
>>    Think about the numbers.  You're talking about updating a significant
>>    chunk of a multi-gigabyte file.  The WAL file tells you the changes
>>    amount to ~600MB of writes.  That's a whole CD's worth of data.  These
>>    days that might not be much for storage, but it is still a lot of
>>    data to move around.  Even if your storage system has a continuous,
>>    sustained write rate of 20MB/sec, that's half a minute (600MB /
>>    20MB/sec = 30 seconds) just to push the bytes.  How fast can your
>>    disk copy 600MB worth of data?
>>
>>    But you're not just writing.  You're doing a lot of reads from all
>>    over the file in an attempt to figure out what to modify and write.
>>    Both the reads and the writes (the WAL checkpoint integration, at
>>    least) are scattered and small, so you're not going to get anywhere
>>    near sustained performance levels.  Achieving a tenth of them would
>>    be extremely good.
>>
>>    Or think of it in more physical numbers... If you're using a single
>>    vanilla disk, it likely spins at 7200 RPM.  If it takes five minutes
>>    to update 5,000,000 records, that's an average of almost 140 records
>>    per disk revolution (five minutes is 36,000 revolutions at 7200 RPM,
>>    and 5,000,000 / 36,000 is roughly 139).  That's pretty good,
>>    considering everything else that is going on!
>>
>>
>>
>>    The only possible way to manipulate that much data in a "few seconds"
>>    is to load up on RAM, get a real operating system, and throw the
>>    whole database into memory.  Or spend many, many, many thousands of
>>    dollars on a very wide disk array with a very large battery-backed
>>    cache and a huge pipe between your host and the array.
>>
>>    Big storage is cheap.  Fast storage is not.  Don't confuse the two.
>>
>>     -j
>>
>>
>
>
> --
> Michele Pradella, R&D
> SELEA s.r.l.
> Via Aldo Moro 69
> Italy - 46019 Cicognara (MN)
> Tel +39 0375 889091
> Fax +39 0375 889080
> michele.prade...@selea.com
> http://www.selea.com



-- 
Stephan Wehner

-> http://stephan.sugarmotor.org (blog and homepage)
-> http://loggingit.com
-> http://www.thrackle.org
-> http://www.buckmaster.ca
-> http://www.trafficlife.com
-> http://stephansmap.org -- http://blog.stephansmap.org
-> http://twitter.com/stephanwehner / @stephanwehner