You are missing input paths. mrConf.addResource() is not the way to add input files; addResource() loads a Hadoop configuration XML file, not data, which is why the job sees no input paths. In Spark, try the DataFrame read functions for parquet, or the sc.textFile function for plain text.
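Something along these lines should work (an untested sketch, assuming Spark 2.0 and a hypothetical input path; adjust the path and the column index to your schema):

```java
import java.util.Arrays;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class ParquetWordCount {
  public static void main(String[] args) {
    SparkSession spark = SparkSession.builder()
        .appName("parquet-word-count")
        .master("local[*]")   // local mode for testing only
        .getOrCreate();

    // The reader is told where the data is; no Hadoop Configuration
    // or InputFormat wiring needed. The path here is a placeholder.
    Dataset<Row> df = spark.read().parquet("/path/to/input.parquet");

    // Word count over the first column, assumed to be a string;
    // change the index to match your schema. Note that in Spark 2.x
    // the FlatMapFunction lambda must return an Iterator.
    JavaRDD<String> words = df.javaRDD()
        .flatMap(row -> Arrays.asList(row.getString(0).split(",")).iterator());

    System.out.println(words.count());
    spark.stop();
  }
}
```

If you are still on Spark 1.6, use SQLContext.read().parquet(...) instead of SparkSession, and have the flatMap lambda return an Iterable rather than an Iterator.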
Best
Ayan

On 23 Aug 2016 07:12, "shamu" <prashant...@hotmail.com> wrote:
> Hi All,
> I am a newbie to Spark/Hadoop.
> I want to read a parquet file and perform a simple word-count. Below is my
> code, however I get an error:
>
> Exception in thread "main" java.io.IOException: No input paths specified in job
>         at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:239)
>         at org.apache.parquet.hadoop.ParquetInputFormat.listStatus(ParquetInputFormat.java:349)
>         at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:387)
>         at org.apache.parquet.hadoop.ParquetInputFormat.getSplits(ParquetInputFormat.java:304)
>         at org.apache.spark.rdd.NewHadoopRDD.getPartitions(NewHadoopRDD.scala:120)
>
> Below is my code. I guess I am missing some core concepts wrt Hadoop
> InputFormats and making them work with Spark. Could you please explain the
> cause and solution to get this working?
>
> -----------------------------code snippet-----------------------------
> JavaSparkContext sc = new JavaSparkContext(conf);
> org.apache.hadoop.conf.Configuration mrConf = new Configuration();
> mrConf.addResource(inputFile);
> JavaPairRDD<String, String> textInputFormatObjectJavaPairRDD =
>     sc.newAPIHadoopRDD(mrConf, ParquetInputFormat.class, String.class, String.class);
> JavaRDD<String> words = textInputFormatObjectJavaPairRDD.values().flatMap(
>     new FlatMapFunction<String, String>() {
>         public Iterable<String> call(String x) {
>             return Arrays.asList(x.split(","));
>         }
>     });
> long x = words.count();
>
> --thanks!
>
> --
> View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/word-count-on-parquet-file-tp27581.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org