Forwarding to common-user to hopefully get more exposure...
---------- Forwarded message ----------
From: Stan Rosenberg <stan.rosenb...@gmail.com>
Date: Tue, Jul 31, 2012 at 11:55 AM
Subject: Re: task jvm bootstrapping via distributed cache
To: mapreduce-u...@hadoop.apache.org

I am guessing this is either a well-known problem or an edge case. In
any case, would it be a bad idea to designate predetermined output
paths? E.g., DistributedCache.addCacheFileInto(uri, conf, outputPath)
would attempt to copy the cached file into the specified path resolving
to a task's local filesystem.

Thanks,

stan

On Mon, Jul 30, 2012 at 6:23 PM, Stan Rosenberg <stan.rosenb...@gmail.com> wrote:
> Hi,
>
> I am seeking a way to leverage hadoop's distributed cache in order to
> ship jars that are required to bootstrap a task's jvm, i.e., before a
> map/reduce task is launched.
> As a concrete example, let's say that I need to launch with
> '-javaagent:/path/profiler.jar'. In theory, the task tracker is
> responsible for downloading cached files onto its local filesystem.
> However, the absolute path to a given cached file is not known a
> priori; however, we need the path in order to configure '-javaagent'.
>
> Is this currently possible with the distributed cache? If not, is the
> use case appealing enough to open a jira ticket?
>
> Thanks,
>
> stan
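
To make the proposal above concrete, here is a minimal sketch in plain Java. It is not Hadoop code: addCacheFileInto does not exist in DistributedCache, and the class PredeterminedCachePath and the methods localizeInto and javaAgentOption are hypothetical names used only for illustration. The sketch assumes the file has already been downloaded to some unpredictable local directory. It then copies the file to a path chosen up front, so that the '-javaagent' option can be written before the task starts.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class PredeterminedCachePath {

    // Hypothetical stand-in for the proposed addCacheFileInto(uri, conf, outputPath):
    // copy an already-downloaded cache file to a path that was fixed in advance.
    static Path localizeInto(Path downloaded, Path outputPath) throws IOException {
        Path parent = outputPath.toAbsolutePath().getParent();
        if (parent != null) {
            Files.createDirectories(parent);
        }
        return Files.copy(downloaded, outputPath, StandardCopyOption.REPLACE_EXISTING);
    }

    // Because the output path is known a priori, the JVM option can be built
    // without asking where the cache happened to place the file.
    static String javaAgentOption(Path agentJar) {
        return "-javaagent:" + agentJar.toAbsolutePath();
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempDirectory("cache-sketch");

        // Simulate the cache dropping the jar into a directory we cannot predict.
        Path unpredictableDir = Files.createDirectories(tmp.resolve("dir-" + System.nanoTime()));
        Path downloaded = Files.write(unpredictableDir.resolve("profiler.jar"), new byte[] {1, 2, 3});

        // The job designates this location before any task is launched.
        Path outputPath = tmp.resolve("agents").resolve("profiler.jar");

        Path localized = localizeInto(downloaded, outputPath);
        System.out.println(javaAgentOption(localized));
    }
}
```

In a real cluster the copy would have to complete on each node before the task JVM is spawned, which is the part the existing cache API leaves open in this thread.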