They also get sent over the wire with things like job submissions, so big ones
can make things slower.
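
A rough way to see why size matters: the sketch below serializes a property set to XML and counts the bytes. It uses java.util.Properties purely as a stand-in for Hadoop's Configuration (that substitution is an assumption; the real class has its own serialization, so the absolute numbers will differ), and the class name, method names and example.key.* property names are all made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Arrays;
import java.util.Properties;

// Hypothetical sketch: java.util.Properties stands in for Hadoop's Configuration.
public class ConfSizeSketch {

    // Serialize the properties to XML in memory and return the byte count,
    // i.e. roughly what would have to travel with each submission.
    public static int serializedSize(Properties props) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        props.storeToXML(out, null);
        return out.size();
    }

    // Build a property set with `count` entries, each value `valueLength` chars long.
    public static Properties build(int count, int valueLength) {
        char[] chars = new char[valueLength];
        Arrays.fill(chars, 'x');
        String value = new String(chars);

        Properties props = new Properties();
        for (int i = 0; i < count; i++) {
            props.setProperty("example.key." + i, value); // made-up key names
        }
        return props;
    }

    public static void main(String[] args) throws IOException {
        int small = serializedSize(build(10, 100));
        int large = serializedSize(build(10000, 100));
        System.out.println("10 entries:    " + small + " bytes");
        System.out.println("10000 entries: " + large + " bytes");
    }
}
```

Every byte counted there is a byte that gets shipped and re-parsed wherever the conf goes, which is the cost being discussed in this thread.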

In my little grumpy project, https://github.com/steveloughran/grumpy , I
actually stuck the Groovy scripts into the config files as strings, so they'd
be submitted as jobs.

The mapper & reducer would simply read the script from the config, parse it as
a method under the mapper context, then run it:

https://github.com/steveloughran/grumpy/blob/master/src/main/groovy/org/apache/hadoop/grumpy/scripted/ScriptedMapper.groovy


> On 15 Jun 2015, at 22:35, Colin P. McCabe <cmcc...@apache.org> wrote:
> 
> Much like zombo.com, the only limit is yourself.
> 
> But huge Configuration objects are going to be really inefficient, so
> I would look elsewhere for storing lots of data.
> 
> best,
> Colin
> 
> On Fri, Jun 12, 2015 at 7:30 PM, Sitaraman Vilayannur
> <vrsitaramanietfli...@gmail.com> wrote:
>> Thanks Allen, what is the total size limit?
>> Sitaraman
>> 
>> 
>> On Fri, Jun 12, 2015 at 10:53 PM, Allen Wittenauer <a...@altiscale.com> 
>> wrote:
>> 
>>> 
>>> On Jun 12, 2015, at 12:37 AM, Sitaraman Vilayannur <
>>> vrsitaramanietfli...@gmail.com> wrote:
>>> 
>>>> Hi,
>>>> What is the limit on the number of properties that can be set using
>>>> set(String s1, String s2) on the Configuration object for hadoop?
>>>> Is this limit configurable if so what is the maximum that can be set?
>>> 
>>>        It's a "total size of the conf" limit, not a "number of" limit.
>>> 
>>>        In general, you shouldn't pack it full of stuff as calling
>>> Configuration is expensive.  Use a side-input/distributed cache file for
>>> mass quantities of bits.
>>> 
>>> 
>>> 
