Yes, they really should. I'll fix the MemPipeline one to be able to correctly write output to directories.
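
For illustration only, here is a minimal sketch of the directory-style behavior I have in mind, written against plain java.nio rather than the Crunch or Hadoop APIs. The class name DirectoryTextWriter and the part-file name "out0.txt" are hypothetical placeholders, not what MemPipeline actually uses; the point is only that the path passed in is treated as a directory and the data lands in a file inside it, which matches what MRPipeline expects.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

public class DirectoryTextWriter {
    // Hypothetical part-file name, chosen for this sketch only.
    static final String PART_FILE = "out0.txt";

    /**
     * Treats outputDir as a directory: creates it if it is missing,
     * writes the lines to a file inside it, and returns that file's path.
     */
    static Path writeTextFile(Path outputDir, List<String> lines) throws IOException {
        // Refuse to proceed if the path already exists as a regular file.
        if (Files.exists(outputDir) && !Files.isDirectory(outputDir)) {
            throw new IOException("Output path exists and is not a directory: " + outputDir);
        }
        Files.createDirectories(outputDir);
        Path target = outputDir.resolve(PART_FILE);
        Files.write(target, lines, StandardCharsets.UTF_8);
        return target;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("crunchOut");
        Path written = writeTextFile(dir, Arrays.asList("first line", "second line"));
        System.out.println("Wrote " + written);
    }
}
```

With something along these lines, passing an existing directory no longer fails with "Is a directory", and passing a path that is already a regular file fails with an explicit message.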
On Thu, Jun 21, 2012 at 3:23 AM, Rahul Sharma <[email protected]> wrote:
> Hi Everyone,
>
> I believe Pipeline types are not completely interchangeable. I wrote
> test cases for MRPipeline, but then I changed the type to MemPipeline.
> Everything went fine, but while creating the output file using
> writeTextFile, it gave an error with the following stack trace:
>
> 1 [main] ERROR com.cloudera.crunch.impl.mem.MemPipeline - Exception writing target: Text(/home/rahul/crunchOut)
> java.io.FileNotFoundException: /home/rahul/crunchOut (Is a directory)
>     at java.io.FileOutputStream.open(Native Method)
>     at java.io.FileOutputStream.<init>(FileOutputStream.java:194)
>     at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.<init>(RawLocalFileSystem.java:189)
>     at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.<init>(RawLocalFileSystem.java:185)
>     at org.apache.hadoop.fs.RawLocalFileSystem.create(RawLocalFileSystem.java:256)
>     at org.apache.hadoop.fs.RawLocalFileSystem.create(RawLocalFileSystem.java:237)
>     at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSOutputSummer.<init>(ChecksumFileSystem.java:336)
>     at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:382)
>     at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:365)
>     at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:584)
>     at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:565)
>     at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:472)
>     at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:464)
>     at com.cloudera.crunch.impl.mem.MemPipeline.write(MemPipeline.java:148)
>     at com.cloudera.crunch.impl.mem.MemPipeline.writeTextFile(MemPipeline.java:178)
>
> When I looked into it, the code in MemPipeline's writeTextFile
> expects a file, while I was passing a folder, which is what
> MRPipeline requires. If I pass a file location instead, MemPipeline
> works but MRPipeline breaks with the following exception:
>
> 1 job failure(s) occurred:
> com.mylearning.crunch.FirstTest: SeqFile(/tmp/crunch1711673673/p1)+top1map+GBK+combine+top1reduce+asText+Text(/home/rahul/crunchOut/sample.txt)(class com.mylearning.crunch.FirstTest0):
> java.io.IOException: Mkdirs failed to create /home/rahul/crunchOut/sample.txt
>     at org.apache.hadoop.fs.RawLocalFileSystem.create(RawLocalFileSystem.java:253)
>     at org.apache.hadoop.fs.RawLocalFileSystem.create(RawLocalFileSystem.java:237)
>     at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:565)
>     at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:472)
>     at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:223)
>     at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:157)
>     at org.apache.hadoop.fs.RawLocalFileSystem.rename(RawLocalFileSystem.java:287)
>     at org.apache.hadoop.fs.ChecksumFileSystem.rename(ChecksumFileSystem.java:429)
>
> Internally, getDestFile(Path src, Path dir, int index) in the
> CrunchJob class expects the path to be a directory, not a file.
>
> Shouldn't the two implementations of writeTextFile be in sync?
>
> regards
> Rahul

--
Director of Data Science
Cloudera
Twitter: @josh_wills
