Hello,
I started using the DataFrame API in Spark 1.3 with Scala.
I am trying to implement a UDF and am following the sample here:
https://spark.apache.org/docs/1.3.0/api/scala/index.html#org.apache.spark.sql.UserDefinedFunction
meaning
val predict = udf((score: Double) => if (score > 0.5)
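For what it's worth, a complete version of that UDF might look like the sketch below. This assumes Spark 1.3, an existing SparkContext `sc`, and a DataFrame `df` with a Double column named `score`; the 0.5 threshold and the returned labels are illustrative, not from the original post.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.functions.udf

// Assumes sc is an existing SparkContext and df is a DataFrame
// with a Double column called "score".
val sqlContext = new SQLContext(sc)

// The closure needs "=>" (not "="), and the predicate ("> 0.5" here)
// is whatever condition the UDF should apply.
val predict = udf((score: Double) => if (score > 0.5) 1.0 else 0.0)

// Apply the UDF to produce a new column.
val withPrediction = df.withColumn("prediction", predict(df("score")))
```

This fragment needs a running Spark application to execute; the key point is just that `udf(...)` takes a Scala function literal written with `=>`.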
Ashutosh,
Were you able to figure this out? I am having the exact same question.
I think the answer is to use Spark SQL to create/load a table in Hive (e.g.
execute the HiveQL CREATE TABLE statement), but I am not sure. Hoping for
something simpler than that.
Anybody?
Thanks!
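In case it helps, the HiveContext route mentioned above can be sketched like this. It assumes a Spark 1.3 build with Hive support and an existing SparkContext `sc`; the table name, columns, and file path are made up for illustration.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.sql.hive.HiveContext

// Assumes sc is an existing SparkContext and Spark was built with Hive support.
val hiveContext = new HiveContext(sc)

// Execute plain HiveQL through Spark SQL to create and populate a table.
hiveContext.sql("CREATE TABLE IF NOT EXISTS events (id INT, name STRING)")
hiveContext.sql("LOAD DATA LOCAL INPATH 'events.txt' INTO TABLE events")

// Alternatively, persist an existing DataFrame as a Hive table:
// df.saveAsTable("events_copy")
```

So there is no separate Hive client needed: the HiveQL statements run through `hiveContext.sql`, which may be the "simpler" path you were hoping for.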
String name = nT[0];
// name is the path of the file picked for processing. The processing
// logic can go inside this loop; once done, you can delete the file
// using the path in the variable name.
    }
}
Thanks.
On Fri, Jan 30, 2015 at 11:37 PM, ganterm [via Apache Spark User List] wrote:
We are running a Spark Streaming job that retrieves files from a directory
(using textFileStream).
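For reference, the setup in question looks roughly like the sketch below; the application name, directory path, and batch interval are placeholders.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

val conf = new SparkConf().setAppName("DirectoryWatcher")
val ssc = new StreamingContext(conf, Seconds(30))

// textFileStream monitors the directory for files that appear after the
// stream starts. Files written into the directory while no job is running
// look "old" on restart, which is why they are not picked up again.
val lines = ssc.textFileStream("/data/incoming")
lines.foreachRDD(rdd => rdd.foreach(println))

ssc.start()
ssc.awaitTermination()
```

This requires a running Spark Streaming application; it is only meant to make the scenario in the question concrete.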
One concern is the case where the job is down while files are still being
added to the directory.
Once the job starts up again, those files are not being picked up (since
they are not