It appears that my issue was caused by the missing sections I
mentioned in the second post. I ran a job with these settings, and my
job finished in < 6 hours. Thanks for your suggestions; they have given
me further ideas for handling issues going forward.
scan.setCaching(500);// 1 is the default
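
For anyone reading this thread later, here is a rough sketch of why the caching value matters so much for a full-table MapReduce scan. It is only back-of-the-envelope arithmetic, not HBase code: the class name, the helper `roundTrips`, and the row count are hypothetical, and it assumes each scanner call returns at most `caching` rows, ignoring size-based limits and the final call that detects the end of a region.

```java
/** Back-of-the-envelope estimate of scanner round trips for a given caching value. */
final class ScanCachingSketch {

    /** Scanner calls needed to fetch rowCount rows when each call returns up to caching rows. */
    static long roundTrips(long rowCount, int caching) {
        if (rowCount < 0 || caching <= 0) {
            throw new IllegalArgumentException("rowCount must be >= 0 and caching must be > 0");
        }
        return (rowCount + caching - 1) / caching; // ceiling division
    }

    public static void main(String[] args) {
        long rows = 1_000_000L; // hypothetical row count, not a figure from this thread
        for (int caching : new int[] {1, 100, 500}) {
            System.out.printf("caching=%d -> %d round trips%n", caching, roundTrips(rows, caching));
        }
    }
}
```

Under these assumptions, a mapper reading 1,000,000 rows makes 1,000,000 round trips with a caching value of 1 and 2,000 with a caching value of 500, which is consistent with the kind of speedup described above, although it does not by itself prove that caching was the only cause.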
Hi Chien,
4. From 50-150k per *second* to 100-150k per *minute*, as stated
above, so reads went *DOWN* significantly. I think you must have
misread.
I will take into account some of your other suggestions.
Thanks,
Colin
On Tue, Apr 12, 2016 at 8:19 PM, Chien Le wrote:
> Some things I wou
Some things I would look at:
1. Node statistics, both the mapper and regionserver nodes. Make sure
they're on fully healthy nodes (no disk issues, no half duplex, etc) and
that they're not already saturated from other jobs.
2. Is there a common regionserver behind the remaining mappers/regions? If
I've noticed that I've omitted
    scan.setCaching(500);        // 1 is the default in Scan, which will be bad for MapReduce jobs
    scan.setCacheBlocks(false);  // don't set to true for MR jobs
which appear to be suggestions from examples. Still I am not sure if
this explains the significant request sl
Excuse my double post. I thought I deleted my draft, and then
constructed a cleaner, more detailed, more readable mail.
On Tue, Apr 12, 2016 at 10:26 PM, Colin Kincaid Williams wrote:
> After trying to get help with distcp on hadoop-user and cdh-user
> mailing lists, I've given up on trying to us