[
https://issues.apache.org/jira/browse/CRUNCH-676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16768731#comment-16768731
]
Ryan Kemp commented on CRUNCH-676:
----------------------------------
[~jwills] I have an API where the consumer passes in a Pipeline. I'd like to
park temporary data in the Crunch working directory and have Crunch clean up
the path once my consumer calls pipeline.done(). I'm just looking for a way to
get a temp working directory from a Pipeline without having to cast.
{code:java}
public static Path createTempPath(final Pipeline pipeline) {
  final Path path;
  if (pipeline instanceof DistributedPipeline) {
    final DistributedPipeline distributedPipeline = (DistributedPipeline) pipeline;
    // sample path: /tmp/crunch-1635159168/p1
    path = distributedPipeline.createTempPath();
  } else {
    // this isn't doing the exact same thing as distributedPipeline...
    // sample path: /tmp/crunch-1635159168
    final String baseDir = pipeline.getConfiguration().get(TMP_DIR, DEFAULT_TEMP_DIR);
    path = new Path(baseDir, "crunch-" + (new Random().nextInt() & Integer.MAX_VALUE));
  }
  LOGGER.info("Created Crunch temp path {}", path);
  return path;
}
{code}
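For illustration, here is a minimal sketch of the two points from the issue description quoted below: read the base directory from "crunch.tmp.dir" (defaulting to "/tmp") and keep a tempFileIndex counter so each call yields a new "pN" child. It is not the Crunch implementation. The class name TempPathSketch is hypothetical, a plain Map stands in for the Hadoop Configuration, and paths are plain Strings rather than org.apache.hadoop.fs.Path, so that the example compiles without Hadoop on the classpath.

```java
import java.util.Map;
import java.util.Random;

// Hypothetical stand-in for the temp-path logic a MemPipeline could carry.
class TempPathSketch {
  // Assumption: one random base directory per pipeline instance.
  private final String tempDirectory;
  // Counter in the spirit of tempFileIndex from the issue description.
  private int tempFileIndex = 0;

  TempPathSketch(Map<String, String> conf, Random random) {
    // Mirrors conf.get("crunch.tmp.dir", "/tmp") using a plain Map.
    String baseDir = conf.getOrDefault("crunch.tmp.dir", "/tmp");
    // Mask the sign bit so the suffix is never negative.
    this.tempDirectory = baseDir + "/crunch-" + (random.nextInt() & Integer.MAX_VALUE);
  }

  String getTempDirectory() {
    return tempDirectory;
  }

  // Each call returns a fresh child of the base directory: p1, p2, ...
  synchronized String createTempPath() {
    tempFileIndex++;
    return tempDirectory + "/p" + tempFileIndex;
  }
}
```

With something like this behind a createTempPath() declared on Pipeline, the helper above would reduce to a single call, and both branches would return paths of the same shape.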
> Declare createTempPath() in the Pipeline interface
> --------------------------------------------------
>
> Key: CRUNCH-676
> URL: https://issues.apache.org/jira/browse/CRUNCH-676
> Project: Crunch
> Issue Type: New Feature
> Components: Core
> Reporter: Ryan Kemp
> Assignee: Josh Wills
> Priority: Minor
>
> DistributedPipeline declares public createTempPath(), but MemPipeline does
> not. Can createTempPath() be declared in Pipeline and implemented in
> MemPipeline?
> The method in MemPipeline could be extremely similar to DistributedPipeline.
> * conf.get("crunch.tmp.dir", "/tmp");
> * Maintain a
> [tempFileIndex|https://github.com/apache/crunch/blob/apache-crunch-0.15.0/crunch-core/src/main/java/org/apache/crunch/impl/dist/DistributedPipeline.java#L82]
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)