Hi Sea,

To use Spark SQL here, you need to create a DataFrame from the file and then run select * against that DataFrame.
In your case, you would do something like this:

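        // Assumes context is a JavaSparkContext, hiveContext is a HiveContext, and schema
        // is a StructType of StringType fields, one per column (every value here is a String).
        JavaRDD<String> lines = context.textFile("path");
        JavaRDD<Row> rowRDD = lines.map(new Function<String, Row>() {
            public Row call(String record) throws Exception {
                // Split on the \001 (Ctrl-A) delimiter; the -1 limit keeps trailing empty fields.
                String[] fields = record.split("\001", -1);
                // RowFactory is org.apache.spark.sql.RowFactory; the cast passes one value per column.
                return RowFactory.create((Object[]) fields);
            }
        });
        DataFrame resultDf = hiveContext.createDataFrame(rowRDD, schema);
        resultDf.registerTempTable("test");
        DataFrame result = hiveContext.sql("select * from test");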

You will need to create the schema for the file first, just as you did for the CSV file.
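
A note on the split itself: Java's String.split with a single argument silently drops trailing empty strings, so a record whose last columns are empty yields fewer fields than the schema has. Passing a limit of -1 keeps them. Here is a minimal JDK-only sketch of the difference (the class name, method name and sample record are made up for illustration, not taken from your job):

```java
class SplitDemo {
    // Splits one \001 (Ctrl-A) delimited record, keeping trailing empty fields.
    static String[] splitRecord(String record) {
        return record.split("\001", -1);
    }

    public static void main(String[] args) {
        // Four columns, the last two empty.
        String record = "a\001b\001\001";
        // Default split drops the trailing empty columns.
        System.out.println(record.split("\001").length);
        // A limit of -1 keeps every column.
        System.out.println(splitRecord(record).length);
    }
}
```

If the field count does not match the schema, the rows will likely fail when the DataFrame is evaluated, so it is worth checking this on a sample of your data first.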




On Friday 23 September 2016 12:26 PM, Sea wrote:
Hi, I want to run SQL directly on files. I found that Spark supports SQL like select * from csv.`/path/to/file`, but the files may not be delimited by ','. They may be delimited by '\001' instead. How can I specify the delimiter?

Thank you!




