Hi Sea,
To use Spark SQL you will need to create a DataFrame from the file and
then run your query against that DataFrame.
In your case you would do something like this:
// needs: org.apache.spark.api.java.JavaRDD, org.apache.spark.api.java.function.Function,
//        org.apache.spark.sql.Row, org.apache.spark.sql.RowFactory, org.apache.spark.sql.DataFrame
JavaRDD<String> lines = context.textFile("path");
JavaRDD<Row> rowRDD = lines.map(new Function<String, Row>() {
    public Row call(String record) throws Exception {
        // "\001" (Ctrl-A / SOH) is the field delimiter
        String[] fields = record.split("\001");
        return RowFactory.create((Object[]) fields);
    }
});
DataFrame resultDf = hiveContext.createDataFrame(rowRDD, schema);
resultDf.registerTempTable("test");
hiveContext.sql("select * from test");
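As a quick standalone check that Java's split handles the '\001' delimiter (plain Java, no Spark needed; the sample record is made up):

```java
public class DelimiterDemo {
    public static void main(String[] args) {
        // "\001" is the octal escape for the Ctrl-A (SOH) character,
        // which is Hive's default field delimiter for text tables.
        String record = "alice" + "\001" + "30" + "\001" + "nyc";
        String[] fields = record.split("\001");
        System.out.println(fields.length); // prints 3
        System.out.println(fields[1]);     // prints 30
    }
}
```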
You will also need to create a schema for the file first, just as you
did for the CSV file.
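For reference, a schema for a three-column file could be built like this (the column names and types here are hypothetical; adjust them to your file's actual layout):

```java
import java.util.ArrayList;
import java.util.List;

import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructField;
import org.apache.spark.sql.types.StructType;

public class SchemaDemo {
    public static void main(String[] args) {
        // Hypothetical columns; replace with your file's real layout.
        List<StructField> fields = new ArrayList<StructField>();
        fields.add(DataTypes.createStructField("name", DataTypes.StringType, true));
        fields.add(DataTypes.createStructField("age", DataTypes.StringType, true));
        fields.add(DataTypes.createStructField("city", DataTypes.StringType, true));
        StructType schema = DataTypes.createStructType(fields);
        System.out.println(schema.fieldNames().length); // prints 3
    }
}
```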
On Friday 23 September 2016 12:26 PM, Sea wrote:
Hi, I want to run SQL directly on files. I found that Spark supports
SQL like select * from csv.`/path/to/file`, but the files may not be
split by ','. They may be split by '\001'; how can I specify the
delimiter?
Thank you!