Thanks Abe. SQOOP-2861 <https://issues.apache.org/jira/browse/SQOOP-2861> has been created for this feature request.

On Tue, Mar 1, 2016 at 10:42 AM, Abraham Fine <[email protected]> wrote:

> At this time sqoop2 does not provide a mechanism to configure the job’s
> scheduler pool or provide a mechanism for passing through arbitrary
> configuration to the map/reduce job.
>
> I am not sure that configuring a scheduler pool is something that we would
> want to specifically prompt for in the shell but I definitely could see the
> use case for passing through job specific mapreduce configuration.
>
> Please feel free to open a JIRA for this feature request.
>
> Thanks,
> Abe
>
> On Feb 29, 2016, at 10:33 AM, Scott Kuehn <[email protected]> wrote:
>
> > Does sqoop2 provide a mechanism to configure jobs to run in ad-hoc
> > scheduler pools? By ad-hoc, I mean a scheduler pool that is not necessarily
> > the same as the pool configured in the sqoop2 server's mapred-site.xml.
> >
> > The use case is to limit cluster-wide sqoop access to a particular FROM
> > resource. While the throttling extractor mechanics are useful for
> > preventing a single job from saturating the resource, this mechanism cannot
> > limit aggregate resource access across jobs. I'd like to allocate a yarn
> > scheduler pool that caps the vcores and ram available for jobs accessing
> > the particularly sensitive database. A subset of sqoop2 jobs would be
> > configured to run in this pool, whereas other sqoop2 jobs would fall back
> > to the default pool configured for the sqoop2 server.
> >
> > A glance at the code and some recent configuration work
> > <https://cwiki.apache.org/confluence/display/SQOOP/Sqoop+Config+as+Top+Level+Entity>
> > suggests this functionality isn't available today. I'm interested to hear
> > if this is the case, and whether or not any reasonable workarounds exist.
> > I'm using apache sqoop 1.99.6-RC2.
