Hi Abe
Yes, this solves my problem.
Thank you very much for the response.
Michal
On 03/14/2016 10:31 PM, Abraham Fine wrote:
Hi Michal-
Currently there is no “official” mechanism for configuring YARN settings within
Sqoop 2.
However, Sqoop loads all -site.xml files for configuration. So, if you are
comfortable changing the YARN settings for all jobs launched by this Sqoop
server, you could add another file that specifies the memory settings you want,
rather than changing them cluster-wide.
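For example (just a sketch; the file name is arbitrary, and I'm assuming the
standard Hadoop MapReduce property names here), you could drop something like
this next to the other -site.xml files:

  <?xml version="1.0"?>
  <configuration>
    <!-- request 4 GB containers for the map tasks that run the Sqoop job -->
    <property>
      <name>mapreduce.map.memory.mb</name>
      <value>4096</value>
    </property>
    <!-- keep the JVM heap safely below the container limit -->
    <property>
      <name>mapreduce.map.java.opts</name>
      <value>-Xmx3277m</value>
    </property>
  </configuration>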
Let me know if this solves your problem.
Thanks,
Abe
On Mar 14, 2016, at 7:43 AM, Michal Vince <[email protected]> wrote:
Hi guys
I'm trying to get a grip on Sqoop 2. I'm running a Hadoop 2 cluster with 2
nodes; YARN has 28 GB of memory, 24 cores, and 4 disks available.
The minimum allocation for a YARN container is 1024 MB of RAM, 1 core, and 0
disks.
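(If I'm reading my yarn-site.xml right, those minimums correspond to the
standard scheduler properties:

  <property>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>1024</value>
  </property>
  <property>
    <name>yarn.scheduler.minimum-allocation-vcores</name>
    <value>1</value>
  </property>
)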
I'm trying to dump a relatively large table (25M rows, 33 columns, stored in
MariaDB with the TokuDB engine) to HDFS using Sqoop's generic JDBC connector.
Every time I try to run the job in Sqoop 2 I get:
2016-03-14 13:07:29,427 INFO [AsyncDispatcher event handler]
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics report
from attempt_1457009691885_0029_m_000004_0: Container
[pid=6536,containerID=container_e09_1457009691885_0029_01_000008] is running
beyond physical memory limits. Current usage: 1.0 GB of 1 GB physical memory
used; 2.6 GB of 2.1 GB virtual memory used. Killing container.
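(If I understand the limits right, the 2.1 GB virtual cap is just the 1 GB
physical allocation multiplied by the default yarn.nodemanager.vmem-pmem-ratio
of 2.1.)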
I tried using different numbers of executors, from 5 to 10k, but with no luck.
It looks to me like Sqoop is allocating the minimum resources for the
container. Is there any way to configure Sqoop to allocate more memory for
this job, or is changing the YARN settings the only way?
Thanks a lot