Hello,
I started using the DataFrame API in Spark 1.3 with Scala.
I am trying to implement a UDF and am following the sample here:
https://spark.apache.org/docs/1.3.0/api/scala/index.html#org.apache.spark.sql.UserDefinedFunction
meaning
val predict = udf((score: Double) => if (score > 0.5)
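For what it's worth, a complete version of that UDF might look like the sketch below. This assumes Spark 1.3, an existing SparkContext `sc`, and a DataFrame `df` with a Double column named `score`; the 0.5 threshold and the returned labels are illustrative, not from the original post.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.functions.udf

// Assumes sc is an existing SparkContext and df is a DataFrame
// with a Double column called "score".
val sqlContext = new SQLContext(sc)

// The closure needs "=>" (not "="), and the predicate ("> 0.5" here)
// is whatever condition the UDF should apply.
val predict = udf((score: Double) => if (score > 0.5) 1.0 else 0.0)

// Apply the UDF to produce a new column.
val withPrediction = df.withColumn("prediction", predict(df("score")))
```

This fragment needs a running Spark application to execute; the key point is just that `udf(...)` takes a Scala function literal written with `=>`.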
Ashutosh,
Were you able to figure this out? I am having the exact same question.
I think the answer is to use Spark SQL to create/load a table in Hive (e.g.
execute the HiveQL CREATE TABLE statement), but I am not sure. Hoping for
something simpler than that.
Anybody?
Thanks!
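In case it helps, the HiveContext route mentioned above can be sketched like this. It assumes a Spark 1.3 build with Hive support and an existing SparkContext `sc`; the table name, columns, and file path are made up for illustration.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.sql.hive.HiveContext

// Assumes sc is an existing SparkContext and Spark was built with Hive support.
val hiveContext = new HiveContext(sc)

// Execute plain HiveQL through Spark SQL to create and populate a table.
hiveContext.sql("CREATE TABLE IF NOT EXISTS events (id INT, name STRING)")
hiveContext.sql("LOAD DATA LOCAL INPATH 'events.txt' INTO TABLE events")

// Alternatively, persist an existing DataFrame as a Hive table:
// df.saveAsTable("events_copy")
```

So there is no separate Hive client needed: the HiveQL statements run through `hiveContext.sql`, which may be the "simpler" path you were hoping for.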
String name = nT[0];
// name is the path of the file picked for processing. The processing
// logic can go inside this loop; once done, you can delete the file
// using the path in the variable name.
    }
}
Thanks.
On Fri, Jan 30, 2015 at 11:37 PM, ganterm [via Apache Spark User List] wrote:
We are running a Spark Streaming job that retrieves files from a directory
(using textFileStream).
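For reference, the setup in question looks roughly like the sketch below; the application name, directory path, and batch interval are placeholders.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

val conf = new SparkConf().setAppName("DirectoryWatcher")
val ssc = new StreamingContext(conf, Seconds(30))

// textFileStream monitors the directory for files that appear after the
// stream starts. Files written into the directory while no job is running
// look "old" on restart, which is why they are not picked up again.
val lines = ssc.textFileStream("/data/incoming")
lines.foreachRDD(rdd => rdd.foreach(println))

ssc.start()
ssc.awaitTermination()
```

This requires a running Spark Streaming application; it is only meant to make the scenario in the question concrete.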
One concern is the case where the job is down while files are still being
added to the directory.
Once the job starts up again, those files are not being picked up (since
they are not