If I understood your question correctly, given the following input:

main_data.txt
{"id": "foo", "some_field": 12354, "score": 0}
{"id": "foobar", "some_field": 12354, "score": 0}
{"id": "baz", "some_field": 12345, "score": 0}

score_data.txt
{"id": "foo", "score": 1}
{"id": "foobar", "score": 20}

you want the following output:

{"id": "foo", "some_field": 12354, "score": 1}
{"id": "foobar", "some_field": 12354, "score": 20}
{"id": "baz", "some_field": 12345, "score": 0}

If that is correct, you can do a LEFT OUTER join on the two relations.

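-- The input lines are JSON, so load them with Pig's built-in JsonLoader
-- rather than the default tab-delimited PigStorage.
main = LOAD 'main_data.txt' USING JsonLoader('id:chararray, some_field:int, score:int');
scores = LOAD 'score_data.txt' USING JsonLoader('id:chararray, score:int');
-- Keep every row of main, whether or not it has a match in scores.
both = JOIN main BY id LEFT OUTER, scores BY id;
-- Pig tests for null with IS NULL rather than == null.
final = FOREACH both GENERATE
    main::id AS id,
    main::some_field AS some_field,
    (scores::score IS NULL ? main::score : scores::score) AS score;
DUMP final;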

After the join, check whether scores::score is null. If it is, there was no
matching row in score_data.txt, so fall back to the default main::score;
otherwise use scores::score.
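
If you want to sanity-check the expected output without running Pig, the same
left-outer-join-with-fallback logic can be sketched in plain Python. This is
only an illustration, not Pig: the merge_scores helper is a hypothetical name,
and it assumes each id appears at most once in score_data.txt (a Pig join
would emit one row per match if ids repeated).

```python
import json

def merge_scores(main_lines, score_lines):
    """Keep every main record; override its score when the id has a match."""
    # Build an id -> score lookup (assumes ids are unique in the score data).
    scores = {}
    for line in score_lines:
        rec = json.loads(line)
        scores[rec["id"]] = rec["score"]

    merged = []
    for line in main_lines:
        rec = json.loads(line)
        new_score = scores.get(rec["id"])  # None plays the role of Pig's null
        if new_score is not None:
            rec["score"] = new_score       # match found: take scores::score
        merged.append(rec)                 # no match: keep main::score
    return merged

main_lines = [
    '{"id": "foo", "some_field": 12354, "score": 0}',
    '{"id": "foobar", "some_field": 12354, "score": 0}',
    '{"id": "baz", "some_field": 12345, "score": 0}',
]
score_lines = [
    '{"id": "foo", "score": 1}',
    '{"id": "foobar", "score": 20}',
]

for rec in merge_scores(main_lines, score_lines):
    print(json.dumps(rec))
```

With the sample data above, "baz" has no entry in the score data and so keeps
its original score of 0, while "foo" and "foobar" pick up 1 and 20.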

Hope this helps!
