Hi Divya,
I suspect the error is thrown from spark-csv: it tries to parse the string
"null" as a double.
The workaround is to add the nullValue option, like .option("nullValue",
"null"). But this nullValue feature is not included in the current spark-csv
1.3 release. Just check out the master of spark-csv and use
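For reference, a minimal sketch of what the read could look like with a spark-csv build from master (the file path and the other options are just illustrative assumptions):

```scala
import org.apache.spark.sql.SQLContext

val sqlContext = new SQLContext(sc)  // sc: an existing SparkContext

val df = sqlContext.read
  .format("com.databricks.spark.csv")
  .option("header", "true")
  .option("inferSchema", "true")
  // treat the literal string "null" in the file as SQL NULL,
  // so it is not fed to the double parser
  .option("nullValue", "null")
  .load("data.csv")  // hypothetical path
```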
Hi Ricky,
In your first try, you are using flatMap, which gives you a flat list of
strings. You are then trying to map each single string to a Row, which
definitely throws an exception.
Following Terry's idea, you map the input to a list of arrays, each
of which contains some strings. Then you
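To make the difference concrete, here is a small sketch (the sample lines and the split delimiter are assumptions):

```scala
import org.apache.spark.sql.Row

val lines = sc.parallelize(Seq("a,1", "b,2"))

// flatMap flattens everything into an RDD[String] of individual
// fields: "a", "1", "b", "2". Mapping each lone field to a
// multi-column Row cannot work.
val flat = lines.flatMap(_.split(","))

// map keeps one Array[String] per input line, so each array can
// become one Row with the right number of columns.
val rows = lines.map(_.split(",")).map(arr => Row(arr: _*))
```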
Hi Francis,
From my observation when using Spark SQL, dataframe.limit(n) does not
necessarily return the same result across different runs of an application.
To be more precise, within one application the result should be the same for
the same n; however, changing n might not yield the same prefix (the result for n =
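A short sketch of the behavior, assuming some DataFrame df with an "id" column:

```scala
// limit(n) only promises *some* n rows; with no ordering there is no
// guarantee which ones, so limit(5) need not be a prefix of limit(10).
val first5  = df.limit(5).collect()
val first10 = df.limit(10).collect()

// Imposing an order on a (unique) key first should make the prefix
// deterministic, at the cost of a sort.
val stable5 = df.orderBy("id").limit(5).collect()
```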
Hi Dan,
In
https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/mllib/feature/HashingTF.scala,
you can see Spark uses Utils.nonNegativeMod(term.##, numFeatures) to locate
a term.
The doc also mentions that it "maps a sequence of terms to their
term frequencies".
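The indexing logic itself is tiny; here is a re-implementation sketch of nonNegativeMod (not the Spark source verbatim), showing how a term's hashCode is mapped into [0, numFeatures):

```scala
// Like x % mod, but shifted into [0, mod) even for negative x,
// since Java/Scala's % can return negative values.
def nonNegativeMod(x: Int, mod: Int): Int = {
  val raw = x % mod
  if (raw < 0) raw + mod else raw
}

// term.## is just the term's hashCode; 2^20 features is HashingTF's default.
val numFeatures = 1 << 20
val index = nonNegativeMod("hello".##, numFeatures)
```

So two different terms can collide in the same bucket; that is the usual hashing-trick trade-off.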
Hi Sarath,
It might be questionable to set num-executors to 64 if you only have 8
nodes. Do you use any action like collect, which will overwhelm the
driver since you have a large dataset?
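As a sketch of what I mean (bigRdd and the output path are hypothetical):

```scala
// collect() pulls every partition back to the driver JVM and can
// exhaust its memory on a large dataset.
// val all = bigRdd.collect()            // risky on big data

// Safer patterns: inspect only a handful of records,
val sample = bigRdd.take(100)
// or keep the result distributed by writing it out instead.
bigRdd.saveAsTextFile("hdfs:///tmp/out")
```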
Thanks
On Tue, Apr 28, 2015 at 10:50 AM, sarath sarathkrishn...@gmail.com wrote:
I am trying to train a