Timo,

Here is an important quote from the Google document mentioned in the prior 
thread:

   "They [compactions] also drop deletion markers if there are no higher 
numbered levels that contain a file whose range overlaps the current key."

The quote is from here:  http://leveldb.googlecode.com/svn/trunk/doc/impl.html
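
To make the quoted rule concrete, here is a minimal Python sketch.  It is a 
simplified model under my own assumptions, not leveldb's actual code: the 
FileRange tuples, the levels list, and the name can_drop_tombstone are all 
made up for illustration.

```python
from typing import List, Tuple

# Hypothetical model: a file is (smallest_key, largest_key),
# and levels[n] is the list of files at Level-n.
FileRange = Tuple[bytes, bytes]


def can_drop_tombstone(key: bytes, output_level: int,
                       levels: List[List[FileRange]]) -> bool:
    """True if no level numbered above output_level has a file covering key."""
    for level_files in levels[output_level + 1:]:
        for smallest, largest in level_files:
            if smallest <= key <= largest:
                # An overlapping file may still hold an older value for this
                # key, so the deletion marker has to be kept.
                return False
    return True
```

With many thousands of files at Level-5, almost any key written into a 
lower-numbered level will overlap some Level-5 file, so the marker survives.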

I would guess that the 10,000+ files you have sitting at Level-5 contain your 
stale data and some stale tombstones.  Those 10,000+ files are preventing data 
discard at the lower-numbered levels, and Level-5 does not get compactions very 
often.  There have been occasional discussions within Basho about creating 
targeted compactions and/or supporting data expiration in leveldb.  As of 
today, nothing has come of them beyond talk.
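
The per-level file counts come from the "compacted to: files[ ... ]" lines in 
each vnode's LOG.  A small Python sketch for pulling them out follows.  The 
function name is hypothetical, and it assumes the log line is on a single 
line (not wrapped the way the mail client wrapped it below).

```python
import re
from typing import List, Optional

# Matches the bracketed list of per-level file counts in a leveldb LOG line.
_FILES_RE = re.compile(r"compacted to: files\[([\d\s]*)\]")


def level_file_counts(log_line: str) -> Optional[List[int]]:
    """Return the file count per level (index 0 = Level-0), or None."""
    match = _FILES_RE.search(log_line)
    if match is None:
        return None
    return [int(n) for n in match.group(1).split()]
```

Index 5 of the returned list is the Level-5 count.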

So …

There is a blind-delete approach that you could consider, especially if you 
are running Riak 1.3 or later.  The approach is simply:

- stop the Riak vnode,
- look at creation date of files in the vnode's directory,
- manually delete the files older than your cutoff date,
- run Riak repair on the vnode so that leveldb can create a MANIFEST that 
matches the files remaining.
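
As a rough aid for the second and third steps, here is a Python sketch that 
only lists candidate files and deletes nothing.  Two assumptions of mine: it 
uses the modification time (st_mtime) as a stand-in for the creation date, 
since a true creation time is not available on every filesystem, and it 
assumes the table files end in ".sst".  The function name and parameters are 
hypothetical.

```python
import os
from typing import List


def files_older_than(vnode_dir: str, cutoff_epoch: float,
                     suffix: str = ".sst") -> List[str]:
    """List files under vnode_dir last modified before cutoff_epoch."""
    old_files = []
    for root, _dirs, names in os.walk(vnode_dir):
        for name in names:
            if not name.endswith(suffix):
                continue  # skip LOG, MANIFEST, and other non-table files
            path = os.path.join(root, name)
            if os.stat(path).st_mtime < cutoff_epoch:
                old_files.append(path)
    return sorted(old_files)
```

Review the returned list by hand before removing anything, and only with the 
vnode stopped.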

Otherwise …

You can continue to wait for the compactions to purge old data.  If you are 
running Riak 1.3 or more recent, you can look at the creation date of files in 
each vnode's sst_5 subdirectory.  Those dates will give you a feel for how 
often your highest numbered levels receive compactions.
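
One way to get that feel is to count files per day in a level's directory.  
The Python sketch below does this.  It again assumes modification time is a 
reasonable proxy for the creation date, and the function name is 
hypothetical.

```python
import os
from collections import Counter
from datetime import datetime, timezone


def files_per_day(level_dir: str) -> Counter:
    """Count files in level_dir by the UTC date of their last modification."""
    counts = Counter()
    for entry in os.scandir(level_dir):
        if entry.is_file():
            stamp = datetime.fromtimestamp(entry.stat().st_mtime,
                                           tz=timezone.utc)
            counts[stamp.date().isoformat()] += 1
    return counts
```

Long stretches with no files dated in them suggest that the level rarely 
receives compactions.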


No, I am not happy with this reply.  I will spend some time considering how 
this issue could be better addressed.

One question for you:  What is the nature of your "key"?  Is it random, date 
based (increasing), etc.?

Matthew




On Sep 25, 2013, at 8:35 AM, Timo Gatsonides <t...@me.com> wrote:

> 
> I've posted about this earlier, see 
> http://lists.basho.com/pipermail/riak-users_lists.basho.com/2013-September/013270.html
> 
> I would like to provide some more background information as I still see no 
> decrease in storage space used at all so far. I currently have a 6 node 
> cluster where I am using the multi-backend, the relevant part of app.config 
> is below. Smaller K/V items are stored in the default backend on a small but 
> fast disk (also leveldb), larger items ranging 1-6 MB per value are stored in 
> a separate LevelDB backend that is mounted on a 2TB SATA drive (side note: I 
> used the "filesystem" backend during development in the early Riak versions, 
> hence the <<"fs">> name). Using the bucket properties the backend is chosen 
> and also for the large items the n_val is set to 2. See the curl output below.
> 
>  %% Riak KV config
>  {riak_kv, [
>             %% Storage_backend specifies the Erlang module defining the
>             %% storage mechanism that will be used on this node.
> 
>             {storage_backend, riak_kv_multi_backend},
>             {multi_backend_default, <<"eleveldb">>},
>             {multi_backend, [
>               {<<"eleveldb">>, riak_kv_eleveldb_backend, [
>                 {data_root, "/data/riak/leveldb"}
>               ]},
>               {<<"fs">>, riak_kv_eleveldb_backend, [
>                 {data_root, "/bigdata/riak/bigleveldb"}
>               ]}, 
>               {<<"cache">>, riak_kv_memory_backend, [
>                 {max_memory, 16}, %% 16Mb
>                 {ttl, 86400} %% 1 Day in seconds
>               ]}
>             ]},
> 
> # curl http://localhost:8098/riak/avis
> {"props":{"allow_mult":false,"backend":"fs","big_vclock":50,"chash_keyfun":{"mod":"riak_core_util","fun":"chash_std_keyfun"},"dw":"quorum","last_write_wins":false,"linkfun":{"mod":"riak_kv_wm_link_walker","fun":"mapreduce_linkfun"},"n_val":2,"name":"avis","old_vclock":86400,"postcommit":[],"precommit":[],"r":1,"rw":"quorum","small_vclock":10,"w":"quorum","young_vclock":20}}
> 
> In the last two weeks I have deleted over 400,000 large values (stored in 
> /bigdata/riak/bigleveldb), placing only small meta-information in there (in 
> Ruby: robject.raw_data = '', robject.meta['info']='info'). So far I have seen 
> absolutely no decrease in data usage at all. Since it is a production system 
> new data is also coming in every day and I've deleted many months of older 
> (>1 year) data in the last two weeks. I do see a few "Compacting" lines in 
> the LevelDB log files (see logging below) but the storage usage is only going 
> up.
> 
> I got some pointers from Matthew (from Basho) and Jens, however I'm still 
> stuck. Maybe I should wait even longer but my disks are filling up so I'll 
> have to expand the cluster or install larger 4TB drives in all the nodes (one 
> by one, copying the data of course).
> 
> Can anyone, maybe especially Matthew, tell me whether I need to be more 
> patient or whether there is something I can actively do about this situation?
> 
> Thanks,
> Timo
> 
> p.s. tail of LevelDB logs: # tail /bigdata/riak/bigleveldb/*/LOG
> ==> 
> /bigdata/riak/bigleveldb/1027618338748291114361965898003636498195577569280/LOG
>  <==
> 2013/09/24-23:36:05.257431 2b0980a4a940 Delete type=2 #103231
> 2013/09/24-23:36:05.323310 2b0980a4a940 Delete type=2 #103233
> 2013/09/24-23:36:05.504968 2b0980a4a940 Delete type=2 #103821
> 2013/09/24-23:36:05.949831 2b0980a4a940 Compacting 1@3 + 1@4 files
> 2013/09/24-23:36:16.909783 2b0980a4a940 Generated table #104017: 155 keys, 
> 420484068 bytes
> 2013/09/24-23:36:21.798106 2b0980a4a940 Generated table #104018: 59 keys, 
> 111461278 bytes
> 2013/09/24-23:36:21.798138 2b0980a4a940 Compacted 1@3 + 1@4 files => 
> 531945346 bytes
> 2013/09/24-23:36:21.799208 2b0980a4a940 compacted to: files[ 0 5 47 293 1253 
> 14319 0 ]
> 2013/09/24-23:36:21.820027 2b0980a4a940 Delete type=2 #103929
> 2013/09/24-23:36:22.846278 2b0980a4a940 Delete type=2 #103183
> 
> ==> 
> /bigdata/riak/bigleveldb/1141798154164767904846628775559596109106197299200/LOG
>  <==
> 2013/09/25-07:06:51.920881 2b097acfe940 Delete type=2 #112230
> 2013/09/25-07:06:52.013155 2b097acfe940 Compacting 1@3 + 3@4 files
> 2013/09/25-07:07:06.047538 2b097acfe940 Generated table #112277: 197 keys, 
> 422634227 bytes
> 2013/09/25-07:07:11.122818 2b097acfe940 Generated table #112278: 40 keys, 
> 121542577 bytes
> 2013/09/25-07:07:11.122847 2b097acfe940 Compacted 1@3 + 3@4 files => 
> 544176804 bytes
> 2013/09/25-07:07:11.123944 2b097acfe940 compacted to: files[ 0 7 51 282 1336 
> 13911 0 ]
> 2013/09/25-07:07:11.148538 2b097acfe940 Delete type=2 #110275
> 2013/09/25-07:07:13.149601 2b097acfe940 Delete type=2 #107642
> 2013/09/25-07:07:13.622819 2b097acfe940 Delete type=2 #107609
> 2013/09/25-07:07:14.179580 2b097acfe940 Delete type=2 #107611
> 
> ==> 
> /bigdata/riak/bigleveldb/114179815416476790484662877555959610910619729920/LOG 
> <==
> 2013/09/24-18:32:50.114531 2b0980a4a940 Delete type=2 #93671
> 2013/09/24-18:32:50.303141 2b0980a4a940 Delete type=2 #101508
> 2013/09/24-18:32:50.487194 2b0980a4a940 Delete type=2 #89442
> 2013/09/24-18:32:50.915524 2b0980a4a940 Delete type=2 #89773
> 2013/09/24-23:33:32.698558 2b098532c940 Level-0 table #108357: started
> 2013/09/24-23:33:33.064892 2b098532c940 Level-0 table #108357: 57458722 
> bytes, 331 keys OK
> 2013/09/24-23:33:33.657453 2b098532c940 Delete type=0 #108326
> 2013/09/25-06:37:42.526121 2b098532c940 Level-0 table #108359: started
> 2013/09/25-06:37:42.935994 2b098532c940 Level-0 table #108359: 55931165 
> bytes, 412 keys OK
> 2013/09/25-06:37:43.447568 2b098532c940 Delete type=0 #108356
> 
> ==> 
> /bigdata/riak/bigleveldb/1255977969581244695331291653115555720016817029120/LOG
>  <==
> 2013/09/24-19:25:02.361168 2b0982b28940 Delete type=2 #102679
> 2013/09/24-21:57:44.215287 2b098492b940 Level-0 table #109548: started
> 2013/09/24-21:57:44.560724 2b098492b940 Level-0 table #109548: 44681848 
> bytes, 224 keys OK
> 2013/09/24-21:57:45.232747 2b098492b940 Delete type=0 #109512
> 2013/09/25-00:14:08.894855 2b098492b940 Level-0 table #109550: started
> 2013/09/25-00:14:09.114181 2b098492b940 Level-0 table #109550: 44591929 
> bytes, 203 keys OK
> 2013/09/25-00:14:09.714129 2b098492b940 Delete type=0 #109547
> 2013/09/25-05:56:00.394497 2b098492b940 Level-0 table #109552: started
> 2013/09/25-05:56:00.750631 2b098492b940 Level-0 table #109552: 46178500 
> bytes, 368 keys OK
> 2013/09/25-05:56:01.198618 2b098492b940 Delete type=0 #109549
> 
> ==> 
> /bigdata/riak/bigleveldb/1370157784997721485815954530671515330927436759040/LOG
>  <==
> 2013/09/24-19:06:38.592678 2b0991618940 Generated table #109843: 197 keys, 
> 248701376 bytes
> 2013/09/24-19:06:38.592698 2b0991618940 Compacted 1@2 + 3@3 files => 
> 395861477 bytes
> 2013/09/24-19:06:38.593401 2b0991618940 compacted to: files[ 0 7 50 285 1396 
> 14138 0 ]
> 2013/09/24-23:56:59.835858 2b099341b940 Level-0 table #109845: started
> 2013/09/24-23:57:00.120522 2b099341b940 Level-0 table #109845: 48290387 
> bytes, 253 keys OK
> 2013/09/24-23:57:00.773145 2b099341b940 Delete type=2 #109294
> 2013/09/24-23:57:01.322884 2b099341b940 Delete type=2 #109777
> 2013/09/24-23:57:01.407742 2b099341b940 Delete type=2 #109295
> 2013/09/24-23:57:02.185318 2b099341b940 Delete type=0 #109809
> 2013/09/24-23:57:02.592206 2b099341b940 Delete type=2 #109293
> 
> ==> 
> /bigdata/riak/bigleveldb/228359630832953580969325755111919221821239459840/LOG 
> <==
> 2013/09/24-19:57:54.841673 2b0991618940 Compacted 1@3 + 1@4 files => 
> 379994890 bytes
> 2013/09/24-19:57:54.842748 2b0991618940 compacted to: files[ 0 8 51 270 1221 
> 13846 0 ]
> 2013/09/24-19:57:54.866534 2b0991618940 Delete type=2 #101969
> 2013/09/24-19:57:58.056087 2b0991618940 Delete type=2 #106525
> 2013/09/24-23:14:37.772560 2b099341b940 Level-0 table #106769: started
> 2013/09/24-23:14:38.158647 2b099341b940 Level-0 table #106769: 49340840 
> bytes, 164 keys OK
> 2013/09/24-23:14:38.563483 2b099341b940 Delete type=0 #106741
> 2013/09/25-06:44:41.739209 2b099341b940 Level-0 table #106771: started
> 2013/09/25-06:44:41.976788 2b099341b940 Level-0 table #106771: 51705699 
> bytes, 291 keys OK
> 2013/09/25-06:44:42.441905 2b099341b940 Delete type=0 #106768
> 
> ==> 
> /bigdata/riak/bigleveldb/342539446249430371453988632667878832731859189760/LOG 
> <==
> 2013/09/25-00:15:07.250766 2b0980a4a940 Generated table #119499: 254 keys, 
> 248480802 bytes
> 2013/09/25-00:15:07.250808 2b0980a4a940 Compacted 1@2 + 3@3 files => 
> 269219364 bytes
> 2013/09/25-00:15:07.251918 2b0980a4a940 compacted to: files[ 0 7 43 270 1381 
> 14278 0 ]
> 2013/09/25-07:00:41.323223 2b098532c940 Level-0 table #119501: started
> 2013/09/25-07:00:41.591647 2b098532c940 Level-0 table #119501: 46957115 
> bytes, 497 keys OK
> 2013/09/25-07:00:42.052639 2b098532c940 Delete type=2 #119141
> 2013/09/25-07:00:42.943180 2b098532c940 Delete type=2 #119139
> 2013/09/25-07:00:43.000400 2b098532c940 Delete type=0 #119449
> 2013/09/25-07:00:43.331502 2b098532c940 Delete type=2 #119414
> 2013/09/25-07:00:43.412299 2b098532c940 Delete type=2 #119140
> 
> ==> 
> /bigdata/riak/bigleveldb/570899077082383952423314387779798054553098649600/LOG 
> <==
> 2013/09/24-18:12:10.641006 2b0991618940 Delete type=2 #92945
> 2013/09/24-19:46:46.133549 2b099341b940 Level-0 table #110509: started
> 2013/09/24-19:46:46.334122 2b099341b940 Level-0 table #110509: 34692200 
> bytes, 120 keys OK
> 2013/09/24-19:46:46.703185 2b099341b940 Delete type=0 #110491
> 2013/09/24-23:14:05.269672 2b099341b940 Level-0 table #110511: started
> 2013/09/24-23:14:05.523896 2b099341b940 Level-0 table #110511: 36141859 
> bytes, 256 keys OK
> 2013/09/24-23:14:05.871651 2b099341b940 Delete type=0 #110508
> 2013/09/25-05:59:12.975178 2b099341b940 Level-0 table #110513: started
> 2013/09/25-05:59:13.220227 2b099341b940 Level-0 table #110513: 34597843 
> bytes, 442 keys OK
> 2013/09/25-05:59:13.652574 2b099341b940 Delete type=0 #110510
> 
> ==> 
> /bigdata/riak/bigleveldb/685078892498860742907977265335757665463718379520/LOG 
> <==
> 2013/09/24-22:00:19.553978 2b097d7da940 Compacted 1@2 + 4@3 files => 
> 472157173 bytes
> 2013/09/24-22:00:19.554663 2b097d7da940 compacted to: files[ 0 8 47 261 1152 
> 13977 0 ]
> 2013/09/24-22:00:19.565408 2b097d7da940 Delete type=2 #117373
> 2013/09/24-22:00:19.731409 2b097d7da940 Delete type=2 #121117
> 2013/09/24-22:00:20.466842 2b097d7da940 Delete type=2 #121787
> 2013/09/24-22:00:20.623605 2b097d7da940 Delete type=2 #121119
> 2013/09/24-22:00:21.318316 2b097d7da940 Delete type=2 #121495
> 2013/09/25-03:08:09.339865 2b098de02940 Level-0 table #121950: started
> 2013/09/25-03:08:09.661629 2b098de02940 Level-0 table #121950: 51046687 
> bytes, 333 keys OK
> 2013/09/25-03:08:10.297044 2b098de02940 Delete type=0 #121920
> 
> ==> 
> /bigdata/riak/bigleveldb/799258707915337533392640142891717276374338109440/LOG 
> <==
> 2013/09/24-20:19:02.314812 2b097acfe940 Delete type=2 #111932
> 2013/09/24-20:19:02.740509 2b097acfe940 Delete type=2 #112083
> 2013/09/24-20:19:02.763467 2b097acfe940 Delete type=2 #112022
> 2013/09/24-20:19:03.012092 2b097acfe940 Delete type=2 #112021
> 2013/09/25-00:07:09.219094 2b097cd98940 Level-0 table #112092: started
> 2013/09/25-00:07:09.550132 2b097cd98940 Level-0 table #112092: 60145463 
> bytes, 350 keys OK
> 2013/09/25-00:07:10.245902 2b097cd98940 Delete type=0 #112073
> 2013/09/25-05:53:47.104078 2b097cd98940 Level-0 table #112094: started
> 2013/09/25-05:53:47.556040 2b097cd98940 Level-0 table #112094: 60746639 
> bytes, 451 keys OK
> 2013/09/25-05:53:48.120232 2b097cd98940 Delete type=0 #112091
> 
> ==> 
> /bigdata/riak/bigleveldb/913438523331814323877303020447676887284957839360/LOG 
> <==
> 2013/09/25-06:16:03.591159 2b097d7da940 Delete type=2 #114611
> 2013/09/25-06:16:03.611863 2b097d7da940 Compacting 1@3 + 2@4 files
> 2013/09/25-06:16:11.030322 2b097d7da940 Generated table #114622: 73 keys, 
> 217984826 bytes
> 2013/09/25-06:16:11.042952 2b097d7da940 Generated table #114623: 98 keys, 
> 33748 bytes
> 2013/09/25-06:16:18.991480 2b097d7da940 Generated table #114624: 147 keys, 
> 251249017 bytes
> 2013/09/25-06:16:18.991508 2b097d7da940 Compacted 1@3 + 2@4 files => 
> 469267591 bytes
> 2013/09/25-06:16:18.993797 2b097d7da940 compacted to: files[ 0 9 38 266 1161 
> 16339 0 ]
> 2013/09/25-06:16:19.019274 2b097d7da940 Delete type=2 #92924
> 2013/09/25-06:16:20.779619 2b097d7da940 Delete type=2 #109438
> 2013/09/25-06:16:21.368453 2b097d7da940 Delete type=2 #114425
> 
> _______________________________________________
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
