On 07/21/2013 04:54 PM, Russell Brown wrote:

On 21 Jul 2013, at 14:20, Siraaj Khandkar <[email protected]> wrote:

On 07/21/2013 07:24 AM, Russell Brown wrote:

Hi,

On 21 Jul 2013, at 02:09, Siraaj Khandkar <[email protected]> wrote:

I (sequentially) made 146204 inserts of unique objects to a single
bucket.  Several secondary indices (most with unique values) were set
for each object, one of which was "bucket" = BucketName (to use 2i
for listing all keys).

There is a special $bucket index for this already, please see the docs
here http://docs.basho.com/riak/latest/dev/using/2i/


Yeah... I stumbled on that piece of info in another doc about two days
ago - made me feel both stupid and validated :)

However, it doesn't seem to work for me - I always get: {ok,{keys,[]}}

Curious. How do you make the 2i query to the $bucket index?

Just as below, but with "$bucket" instead of "bucket":

Index = {binary_index, "$bucket"},
riakc_pb_socket:get_index(PID, Bucket, Index, Bucket).
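For comparison, here is a minimal sketch of the same query with the index name passed as a raw binary. It rests on assumptions about the Erlang client that nothing in this thread confirms: that `{binary_index, "$bucket"}` is expanded to the field name `$bucket_bin`, which would be an ordinary (and here empty) index rather than the special `$bucket` one, and that `get_index/4` also accepts the literal binary `<<"$bucket">>`. The module and function names are made up for illustration.

```erlang
-module(bucket_index_sketch).
-export([list_keys/2]).

%% List the keys of Bucket through the special $bucket index.
%% Assumption: this client version accepts a raw binary index name
%% and passes it through without appending "_bin".
list_keys(Pid, Bucket) when is_pid(Pid), is_binary(Bucket) ->
    riakc_pb_socket:get_index(Pid, Bucket, <<"$bucket">>, Bucket).
```

If the assumption holds, `bucket_index_sketch:list_keys(PID, Bucket)` should return the same shape of result as the query above, with a non-empty key list.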




6 of the objects appear to have been lost - they're consistently not
found by GETs (by key) and are not found by 2i queries to the indices
with unique values.

Oh. Erm. Have you deleted some keys? 2i is essentially an r=1 query.


Sort-of. This was a second instance of this batch insertion (a slightly
extended set of keys), the first one was deleted ~6 hours prior to
executing the second one.

At the end of the deletion there _were_ some tombstones left. Frankly I
do not remember with certainty if there are overlaps between tombstones
from the previous delete and the keys in question. In retrospect - it was
a big failure on my part not to take note of those.

After the second instance of the set insertion - there were _no_
more deletions.

So, in summary:

1) Inserted the set
2) Deleted the set
3) 6 hours passed
4) Inserted the set
5) Observed the problem

What is your delete_mode setting, please 
(http://docs.basho.com/riak/latest/ops/advanced/configs/configuration-files/)?


It is not configured explicitly, so I am assuming the default 3 second delay.
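For reference, an explicit setting would sit in the `riak_kv` section of `app.config`. The fragment below is only a sketch that spells out what is believed to be the default (3000 ms); `keep` and `immediate` are understood to be the other accepted values. Nothing here is taken from the actual configuration of the cluster under discussion.

```erlang
%% app.config (fragment): how long a tombstone lingers before it is reaped.
[
 {riak_kv, [
   %% keep | immediate | delay in milliseconds; 3000 is the assumed default
   {delete_mode, 3000}
 ]}
].
```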


Did the second insert do a fetch to get a tombstone vclock before trying to 
overwrite the key, or a PUT with an empty vclock?

PUT with an empty vclock.
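The alternative asked about, fetching first so that the write descends from any remaining tombstone, could look roughly like the sketch below. It assumes the Erlang client's `deletedvclock` get option, which is understood to return `{error, notfound, VClock}` while a tombstone is still present. The module and function names are hypothetical, and whether any of this relates to the missing keys is not established here.

```erlang
-module(tombstone_put_sketch).
-export([put_over_tombstone/4]).

%% Write Value under Bucket/Key, reusing the vclock of whatever is
%% already there (live object or tombstone) instead of an empty one.
%% Any other error from get/4 is deliberately left to crash the caller.
put_over_tombstone(Pid, Bucket, Key, Value) ->
    New = riakc_obj:new(Bucket, Key, Value),
    Obj = case riakc_pb_socket:get(Pid, Bucket, Key, [deletedvclock]) of
              {ok, Existing} ->
                  %% Live object: update it so its vclock is carried over.
                  riakc_obj:update_value(Existing, Value);
              {error, notfound, VClock} ->
                  %% Tombstone still present: attach its vclock.
                  riakc_obj:set_vclock(New, VClock);
              {error, notfound} ->
                  %% Nothing there at all: a fresh object is fine.
                  New
          end,
    riakc_pb_socket:put(Pid, Obj).
```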



Now, I understand there may be a replication lag, but this state has
remained for over 3 days now.

"What is fucked, and why?" :)

Good question.


I was hoping this list would appreciate the reference :)


Could you provide some more details to help me figure it out: How many
nodes are you running?

5


Can you provide an example of the 2i queries you're running?

This is how I am testing it:

    Compare = fun(PID, Bucket) ->
        B = Bucket,
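        % Run the same query twice; a consistent cluster should return
        % identical key lists. get_index returns {ok, {keys, Keys}}.
        {ok, {keys, L1}} = riakc_pb_socket:get_index(PID, B, {binary_index, "bucket"}, B),
        {ok, {keys, L2}} = riakc_pb_socket:get_index(PID, B, {binary_index, "bucket"}, B),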
        io:format("L1: ~b, L2: ~b~n",[length(L1), length(L2)]),
        Diff_L1_L2 = L1 -- L2,
        Diff_L2_L1 = L2 -- L1,
        io:format("=== L1 -- L2 ===~n~p~n~n", [Diff_L1_L2]),
        io:format("=== L2 -- L1 ===~n~p~n~n", [Diff_L2_L1]),
        Fetch = fun(Key) ->
            case riakc_pb_socket:get(PID, B, Key) of
                {ok, _}    -> io:format("FOUND: ~p~n", [Key]);
                {error, _} -> io:format("NOT FOUND: ~p~n", [Key])
            end
        end,
        io:format("=== L1 -- L2 ===~n"),
        lists:foreach(Fetch, Diff_L1_L2),
        io:format("=== L2 -- L1 ===~n"),
        lists:foreach(Fetch, Diff_L2_L1)
    end.

Which results in differences _sometimes_, but _always_ fails on get.
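
With roughly 146204 keys per list, `L1 -- L2` can be slow on Erlang releases where list subtraction is quadratic. A sketch of the same two-way comparison using ordered sets from the standard library follows; the module name is made up for illustration.

```erlang
-module(key_diff_sketch).
-export([diff/2]).

%% Return {OnlyInFirst, OnlyInSecond} for two key lists.
%% Duplicates are collapsed and results come back sorted, which
%% differs from `--` (it preserves order and multiplicity).
diff(Keys1, Keys2) ->
    S1 = ordsets:from_list(Keys1),
    S2 = ordsets:from_list(Keys2),
    {ordsets:subtract(S1, S2), ordsets:subtract(S2, S1)}.
```

Inside the test fun this would be called as `{Diff_L1_L2, Diff_L2_L1} = key_diff_sketch:diff(L1, L2)`.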


If this is just a dev cluster, can you verify the keys are present /
absent using either a range 2i $keys query, or a key list, please?


Unfortunately this is prod, so brute-force key list is out of the
question.

Running:
    curl "http://127.0.0.1:8098/buckets/$bucket/index/\$keys_bin/0/z";

Returns:
    {"keys":[]}




_______________________________________________
riak-users mailing list
[email protected]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
