ETL and workflow management on Spark

2014-05-22 Thread William Kang
Hi, We are moving toward adopting the full Spark stack. So far, we have used Shark to do some ETL work, which is not bad but is not perfect either. We ended up writing UDFs, UDGFs, and UDAFs that could have been avoided if we could use Pig. Do you have any suggestions for an ETL solution in the Spark stack? And

Re: ETL and workflow management on Spark

2014-05-22 Thread Derek Schoettle
unsubscribe From: William Kang weliam.cl...@gmail.com To: user@spark.apache.org Date: 05/22/2014 10:50 AM Subject: ETL and workflow management on Spark Hi, We are moving into adopting the full stack of Spark. So far, we have used Shark to do some ETL work, which is not bad

Re: ETL and workflow management on Spark

2014-05-22 Thread Mayur Rustagi
Hi, We are in the process of migrating Pig onto Spark. What is your current Spark setup? Which version and cluster manager do you use? Also, what data size are you working with right now? Regards, Mayur. Mayur Rustagi Ph: +1 (760) 203 3257 http://www.sigmoidanalytics.com @mayur_rustagi