Hi, I think you need to import "org.apache.spark.sql.types.DataTypes" (plural) instead of "org.apache.spark.sql.types.DataType", and use DataTypes to access StringType and the createStructField/createStructType factory methods.
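For reference, here is a minimal sketch of just the schema-building part with that change applied (assumes spark-sql 1.3.x on the classpath; the class name SchemaExample is only for illustration):

```java
import java.util.ArrayList;
import java.util.List;

// DataTypes (plural) holds the factory methods and the type singletons;
// DataType (singular) is only the abstract base class of all SQL types.
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructField;
import org.apache.spark.sql.types.StructType;

public class SchemaExample {
    public static void main(String[] args) {
        // The schema is encoded in a string, as in the programming guide.
        String schemaString = "name age";

        List<StructField> fields = new ArrayList<StructField>();
        for (String fieldName : schemaString.split(" ")) {
            // DataType.createStructField / DataType.StringType do not exist,
            // which is what produces the "cannot find symbol" error;
            // the equivalents live on DataTypes.
            fields.add(DataTypes.createStructField(fieldName, DataTypes.StringType, true));
        }
        StructType schema = DataTypes.createStructType(fields);

        System.out.println(schema.fields().length);
    }
}
```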
HTH,
Deng

On Mon, May 4, 2015 at 9:37 PM, Saurabh Gupta <saurabh.gu...@semusi.com> wrote:
> I am really new to this, but what should I look for in the maven logs?
>
> I have tried mvn package -X -e
>
> Should I show the full trace?
>
> On Mon, May 4, 2015 at 6:54 PM, Driesprong, Fokko <fo...@driesprong.frl> wrote:
>> Hi Saurabh,
>>
>> Did you check the log of maven?
>>
>> 2015-05-04 15:17 GMT+02:00 Saurabh Gupta <saurabh.gu...@semusi.com>:
>>> Hi,
>>>
>>> I am trying to build the example code given at
>>>
>>> https://spark.apache.org/docs/latest/sql-programming-guide.html#interoperating-with-rdds
>>>
>>> The code is:
>>>
>>> // Import factory methods provided by DataType.
>>> import org.apache.spark.sql.types.DataType;
>>> // Import StructType and StructField
>>> import org.apache.spark.sql.types.StructType;
>>> import org.apache.spark.sql.types.StructField;
>>> // Import Row.
>>> import org.apache.spark.sql.Row;
>>>
>>> // sc is an existing JavaSparkContext.
>>> SQLContext sqlContext = new org.apache.spark.sql.SQLContext(sc);
>>>
>>> // Load a text file and convert each line to a JavaBean.
>>> JavaRDD<String> people = sc.textFile("examples/src/main/resources/people.txt");
>>>
>>> // The schema is encoded in a string
>>> String schemaString = "name age";
>>>
>>> // Generate the schema based on the string of schema
>>> List<StructField> fields = new ArrayList<StructField>();
>>> for (String fieldName : schemaString.split(" ")) {
>>>   fields.add(DataType.createStructField(fieldName, DataType.StringType, true));
>>> }
>>> StructType schema = DataType.createStructType(fields);
>>>
>>> // Convert records of the RDD (people) to Rows.
>>> JavaRDD<Row> rowRDD = people.map(
>>>   new Function<String, Row>() {
>>>     public Row call(String record) throws Exception {
>>>       String[] fields = record.split(",");
>>>       return Row.create(fields[0], fields[1].trim());
>>>     }
>>>   });
>>>
>>> // Apply the schema to the RDD.
>>> DataFrame peopleDataFrame = sqlContext.createDataFrame(rowRDD, schema);
>>>
>>> // Register the DataFrame as a table.
>>> peopleDataFrame.registerTempTable("people");
>>>
>>> // SQL can be run over RDDs that have been registered as tables.
>>> DataFrame results = sqlContext.sql("SELECT name FROM people");
>>>
>>> // The results of SQL queries are DataFrames and support all the normal RDD operations.
>>> // The columns of a row in the result can be accessed by ordinal.
>>> List<String> names = results.map(new Function<Row, String>() {
>>>   public String call(Row row) {
>>>     return "Name: " + row.getString(0);
>>>   }
>>> }).collect();
>>>
>>> My pom file looks like:
>>>
>>> <dependencies>
>>>   <dependency>
>>>     <groupId>org.apache.spark</groupId>
>>>     <artifactId>spark-core_2.10</artifactId>
>>>     <version>1.3.1</version>
>>>   </dependency>
>>>   <dependency>
>>>     <groupId>org.apache.spark</groupId>
>>>     <artifactId>spark-sql_2.10</artifactId>
>>>     <version>1.3.1</version>
>>>   </dependency>
>>>   <dependency>
>>>     <groupId>org.apache.hbase</groupId>
>>>     <artifactId>hbase</artifactId>
>>>     <version>0.94.0</version>
>>>   </dependency>
>>> </dependencies>
>>>
>>> When I try to mvn package I am getting this issue:
>>>
>>> cannot find symbol
>>> [ERROR] symbol:   variable StringType
>>> [ERROR] location: class org.apache.spark.sql.types.DataType
>>>
>>> I have gone through
>>> https://spark.apache.org/docs/latest/api/java/org/apache/spark/sql/types/StringType.html
>>>
>>> What is missing here?