Daniel,
Currently, having Tachyon will at least help on the input part in this case.
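For reference, a minimal sketch of what reading the input through Tachyon might look like, assuming a Tachyon master is running; the `master:19998` address and the path below are placeholders, not values from this thread:

```scala
import org.apache.spark.sql.SQLContext

val sqlContext = new SQLContext(sc)

// Read the JSON logs through Tachyon instead of directly from S3.
// "master:19998" and "/source/path" are hypothetical placeholders
// for your own deployment.
val data = sqlContext.jsonFile("tachyon://master:19998/source/path")
```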
Haoyuan
On Fri, Oct 24, 2014 at 2:01 PM, Daniel Mahler wrote:
> I am trying to convert some json logs to Parquet and save them on S3.
> In principle this is just
>
> import org.apache.spark._
> val sqlContext
I am trying to convert some json logs to Parquet and save them on S3.
In principle this is just
import org.apache.spark._
val sqlContext = new sql.SQLContext(sc)
val data = sqlContext.jsonFile("s3n://source/path/*/*", 10e-8)
data.registerAsTable("data")
data.saveAsParquetFile("s3n://target/path")