[ 
https://issues.apache.org/jira/browse/SPARK-11589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Owen updated SPARK-11589:
------------------------------
    Priority: Major  (was: Blocker)

[~hunttang] Also, in general only committers set the priority to Blocker.

> Cannot create files by Hadoop FileSystem in JavaRDD.foreach
> -----------------------------------------------------------
>
>                 Key: SPARK-11589
>                 URL: https://issues.apache.org/jira/browse/SPARK-11589
>             Project: Spark
>          Issue Type: Bug
>    Affects Versions: 1.5.1
>            Reporter: Hunt Tang
>
> I'm using Hadoop 2.6.0 and Spark 1.5.1.
> I want to write zip files using the Hadoop DistributedFileSystem inside 
> JavaRDD.foreach; the sample code is below. The code runs normally (both 
> test code 1 and 2) when I set the master to local mode. However, when I set 
> it to yarn mode (either yarn-client or yarn-cluster), the files from test 
> code 2 are not created, and no error log is printed.
> {code:java}
>         Configuration fileSystemConf = new Configuration();
>         fileSystemConf.set("fs.defaultFS", "hdfs://myHostname:9000");
>         // Test code 1
>         FileSystem fsTemp = FileSystem.get(fileSystemConf);
>         FSDataOutputStream fosTemp = fsTemp.create(new Path(output + "test.zip"), true);
>         ZipOutputStream zosTemp = new ZipOutputStream(fosTemp);
>         zosTemp.putNextEntry(new ZipEntry("task.json"));
>         zosTemp.write(new byte[1]);
>         zosTemp.close();
>         fosTemp.close();
>         // Test code 2
>         JavaPairRDD<Integer, Iterable<String>> packageImageIdGroup = packageImageIdsMap.groupByKey();
>         packageImageIdGroup.foreach((packageImageIdsPair) -> {
>             String packageName = String.format("%03d", packageImageIdsPair._1());
>             String filename = output + packageName + ".zip";
>             Iterable<String> packageImageIds = packageImageIdsPair._2();
>             FileSystem fs = FileSystem.get(fileSystemConf);
>             FSDataOutputStream fos = fs.create(new Path(filename), true);
>             ZipOutputStream zos = new ZipOutputStream(fos);
>             for (String imageId : packageImageIds) {
>                 String imageFilename = packageName + "/Image/" + imageId + ".jpg";
>                 zos.putNextEntry(new ZipEntry(imageFilename));
>                 zos.write(new byte[1]);
>             }
>             zos.close();
>             fos.close();
>         });
> {code}
> I know I should wrap the streams in try-with-resources, but please 
> disregard this for now.
> Is there any clue why the code does not work in yarn mode? I would very 
> much appreciate any help. Thanks!
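
On the stream-handling point raised in the description, here is a minimal sketch of the same zip-writing loop with the stream managed by try-with-resources. It is only an illustration and does not address the yarn-mode behavior reported above. To stay runnable without a cluster it writes to an in-memory ByteArrayOutputStream in place of the FSDataOutputStream returned by fs.create(...); the names ZipSketch, writePackage and entryNames are hypothetical and do not appear in the report.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper class, for illustration only.
class ZipSketch {

    // Writes one placeholder byte per image entry, mirroring the loop in test code 2.
    // "sink" stands in for the FSDataOutputStream from the report (an assumption
    // made so the sketch runs locally).
    static void writePackage(OutputStream sink, String packageName, List<String> imageIds)
            throws IOException {
        // Closing the ZipOutputStream finishes the archive and also closes "sink",
        // even if an exception is thrown inside the loop.
        try (ZipOutputStream zos = new ZipOutputStream(sink)) {
            for (String imageId : imageIds) {
                zos.putNextEntry(new ZipEntry(packageName + "/Image/" + imageId + ".jpg"));
                zos.write(new byte[1]);
                zos.closeEntry();
            }
        }
    }

    // Reads the archive back and returns its entry names, to check the result.
    static List<String> entryNames(byte[] zipBytes) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zis = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            ZipEntry entry;
            while ((entry = zis.getNextEntry()) != null) {
                names.add(entry.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        writePackage(sink, String.format("%03d", 1), Arrays.asList("a", "b"));
        System.out.println(entryNames(sink.toByteArray()));
    }
}
```

In the reported code, the same try block would presumably wrap the stream obtained from fs.create(new Path(filename), true), which makes the separate zos.close() and fos.close() calls unnecessary.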



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
