I have written a simple four-line Spark program to process data in a Phoenix table:

    // Get the query for the Phoenix table, e.g. "SELECT col FROM table"
    String queryString = getQueryFullString();
    JavaPairRDD<NullWritable, TestWritable> phRDD = jsc.newAPIHadoopRDD(
            configuration,
            PhoenixInputFormat.class,
            NullWritable.class,
            TestWritable.class);

    // Goal is to scan all the data
    JavaRDD<Long> rdd = phRDD.map(new Function<Tuple2<NullWritable, TestWritable>, Long>() {
        @Override
        public Long call(Tuple2<NullWritable, TestWritable> tuple) throws Exception {
            return 1L;
        }
    });

    System.out.println(rdd.count());

This program takes 2 hours to process 2 million records. Can anyone help me understand what is wrong?
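For reference, this is roughly how the configuration passed to newAPIHadoopRDD is built; a minimal sketch, where "MY_TABLE" is a stand-in for the real table name:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.phoenix.mapreduce.util.PhoenixMapReduceUtil;

    final Configuration conf = HBaseConfiguration.create();
    final Job job = Job.getInstance(conf, "phoenix-scan");
    // TestWritable implements DBWritable; "MY_TABLE" stands in for the actual table name
    PhoenixMapReduceUtil.setInput(job, TestWritable.class, "MY_TABLE", queryString);
    Configuration configuration = job.getConfiguration();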