Re: Loading JSON dataset with Spark Mllib

2015-02-15 Thread gen tang
Hi, In fact, you can use sqlCtx.jsonFile(), which loads a text file storing one JSON object per line as a SchemaRDD. Alternatively, you can use sc.textFile() to load the text file into an RDD and then call sqlCtx.jsonRDD(), which loads an RDD storing one JSON object per string as a SchemaRDD. Hope this helps.
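To make the "one JSON object per line" layout concrete, here is a minimal sketch in plain Python (standard library only, no Spark). The field names "label" and "features" and the file name points.json are made up for illustration. It only shows the shape of input that both calls above expect; it is not a substitute for them.

```python
import json
import os
import tempfile

# Hypothetical sample records; the field names are assumptions for illustration.
records = [
    {"label": 1.0, "features": [0.5, 1.2]},
    {"label": 0.0, "features": [2.0, 0.1]},
]

# Write one complete JSON object per line (no enclosing array, no commas
# between lines). This is the layout described for jsonFile() above.
path = os.path.join(tempfile.mkdtemp(), "points.json")
with open(path, "w") as f:
    for rec in records:
        f.write(json.dumps(rec) + "\n")

# Read it back line by line; each line parses independently of the others,
# which is what lets the file be treated as a collection of strings.
with open(path) as f:
    parsed = [json.loads(line) for line in f if line.strip()]

# With Spark, per the reply above, the same file would be passed to
# sqlCtx.jsonFile(path), or to sc.textFile(path) and then sqlCtx.jsonRDD(...).
```

A pretty-printed JSON document that spans several lines would not fit this layout, since each line must be a complete object on its own.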

Loading JSON dataset with Spark Mllib

2015-02-15 Thread pankaj channe
Hi, I am new to Spark and am planning to write a machine learning application with Spark MLlib. My dataset is in JSON format. Is it possible to load the data into Spark without using any external JSON libraries? I have explored Spark SQL, but I believe that is only for interactive use or