Hi, I think you need to import "org.apache.spark.sql.types.DataTypes" (plural) instead of "org.apache.spark.sql.types.DataType", and use DataTypes to access StringType and the createStructField/createStructType factory methods.
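For reference, here is a minimal sketch of just the schema-building part with that change applied (assumes spark-sql 1.3.x on the classpath; the class name SchemaExample is only for illustration):

```java
import java.util.ArrayList;
import java.util.List;

// DataTypes (plural) holds the factory methods and the type singletons;
// DataType (singular) is only the abstract base class of all SQL types.
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructField;
import org.apache.spark.sql.types.StructType;

public class SchemaExample {
    public static void main(String[] args) {
        // The schema is encoded in a string, as in the programming guide.
        String schemaString = "name age";

        List<StructField> fields = new ArrayList<StructField>();
        for (String fieldName : schemaString.split(" ")) {
            // DataType.createStructField / DataType.StringType do not exist,
            // which is what produces the "cannot find symbol" error;
            // the equivalents live on DataTypes.
            fields.add(DataTypes.createStructField(fieldName, DataTypes.StringType, true));
        }
        StructType schema = DataTypes.createStructType(fields);

        System.out.println(schema.fields().length);
    }
}
```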
HTH,
Deng

On Mon, May 4, 2015 at 9:37 PM, Saurabh Gupta <saurabh.gu...@semusi.com> wrote:
> I am really new to this, but what should I look for in the maven logs?
>
> I have tried mvn package -X -e
>
> Should I show the full trace?
>
> On Mon, May 4, 2015 at 6:54 PM, Driesprong, Fokko <fo...@driesprong.frl> wrote:
>> Hi Saurabh,
>>
>> Did you check the log of maven?
>>
>> 2015-05-04 15:17 GMT+02:00 Saurabh Gupta <saurabh.gu...@semusi.com>:
>>> Hi,
>>>
>>> I am trying to build the example code given at
>>>
>>> https://spark.apache.org/docs/latest/sql-programming-guide.html#interoperating-with-rdds
>>>
>>> The code is:
>>>
>>> // Import factory methods provided by DataType.
>>> import org.apache.spark.sql.types.DataType;
>>> // Import StructType and StructField
>>> import org.apache.spark.sql.types.StructType;
>>> import org.apache.spark.sql.types.StructField;
>>> // Import Row.
>>> import org.apache.spark.sql.Row;
>>>
>>> // sc is an existing JavaSparkContext.
>>> SQLContext sqlContext = new org.apache.spark.sql.SQLContext(sc);
>>>
>>> // Load a text file and convert each line to a JavaBean.
>>> JavaRDD<String> people = sc.textFile("examples/src/main/resources/people.txt");
>>>
>>> // The schema is encoded in a string
>>> String schemaString = "name age";
>>>
>>> // Generate the schema based on the string of schema
>>> List<StructField> fields = new ArrayList<StructField>();
>>> for (String fieldName : schemaString.split(" ")) {
>>>   fields.add(DataType.createStructField(fieldName, DataType.StringType, true));
>>> }
>>> StructType schema = DataType.createStructType(fields);
>>>
>>> // Convert records of the RDD (people) to Rows.
>>> JavaRDD<Row> rowRDD = people.map(
>>>   new Function<String, Row>() {
>>>     public Row call(String record) throws Exception {
>>>       String[] fields = record.split(",");
>>>       return Row.create(fields[0], fields[1].trim());
>>>     }
>>>   });
>>>
>>> // Apply the schema to the RDD.
>>> DataFrame peopleDataFrame = sqlContext.createDataFrame(rowRDD, schema);
>>>
>>> // Register the DataFrame as a table.
>>> peopleDataFrame.registerTempTable("people");
>>>
>>> // SQL can be run over RDDs that have been registered as tables.
>>> DataFrame results = sqlContext.sql("SELECT name FROM people");
>>>
>>> // The results of SQL queries are DataFrames and support all the normal RDD operations.
>>> // The columns of a row in the result can be accessed by ordinal.
>>> List<String> names = results.map(new Function<Row, String>() {
>>>   public String call(Row row) {
>>>     return "Name: " + row.getString(0);
>>>   }
>>> }).collect();
>>>
>>> My pom file looks like:
>>>
>>> <dependencies>
>>>   <dependency>
>>>     <groupId>org.apache.spark</groupId>
>>>     <artifactId>spark-core_2.10</artifactId>
>>>     <version>1.3.1</version>
>>>   </dependency>
>>>   <dependency>
>>>     <groupId>org.apache.spark</groupId>
>>>     <artifactId>spark-sql_2.10</artifactId>
>>>     <version>1.3.1</version>
>>>   </dependency>
>>>   <dependency>
>>>     <groupId>org.apache.hbase</groupId>
>>>     <artifactId>hbase</artifactId>
>>>     <version>0.94.0</version>
>>>   </dependency>
>>> </dependencies>
>>>
>>> When I try to mvn package I am getting this issue:
>>>
>>> cannot find symbol
>>> [ERROR] symbol:   variable StringType
>>> [ERROR] location: class org.apache.spark.sql.types.DataType
>>>
>>> I have gone through
>>> https://spark.apache.org/docs/latest/api/java/org/apache/spark/sql/types/StringType.html
>>>
>>> What is missing here?