Hi, when I use Multi Table/File Inserts commands, some of them are no more
efficient than running the table insert commands separately.
For example,
from pokes
insert overwrite table pokes_count
select bar,count(foo) group by bar
insert overwrite table pokes_sum
select bar,sum(foo) group by bar;
Executing this requires 2 map/reduce jobs, which is no fewer than running the
two commands separately:
insert overwrite table pokes_count
select bar,count(foo) from pokes group by bar;
insert overwrite table pokes_sum
select bar,sum(foo) from pokes group by bar;
And the time taken is the same.
But the first form seems to scan the table 'pokes' only once, so why does it
still need 2 map/reduce jobs? And why isn't the time taken any less?
Is there any way to make it more efficient?
Thanks a lot,
Zhou