at 12:15 PM, Aram Mkrtchyan
aram.mkrtchyan...@gmail.com wrote:
What are the best practices for submitting a Spark Streaming application on
Mesos?
I would like to know about the scheduler mode.
Is `coarse-grained` mode the right solution?
Thanks
We want to migrate our data (approximately 20M rows) from Parquet to Postgres.
When we use the DataFrame writer's jdbc method, the execution time is very
long; when we tried the same with batch inserts, it was much more effective.
Is it intentionally implemented that way?
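One common workaround, sketched below under assumptions: the JDBC URL, table name, and credentials are placeholders, and the `batchsize` connection property is only honored by newer Spark releases (older ones require a manual `foreachPartition` with a JDBC `PreparedStatement` batch instead).

```scala
import java.util.Properties
import org.apache.spark.sql.SaveMode

// Placeholder connection details -- replace with your own.
val props = new Properties()
props.setProperty("user", "postgres")
props.setProperty("password", "secret")
// Assumption: this Spark version supports the batchsize hint, which groups
// inserts into JDBC batches per partition instead of row-by-row statements.
props.setProperty("batchsize", "10000")

df.repartition(8)          // roughly one JDBC connection per partition
  .write
  .mode(SaveMode.Append)
  .jdbc("jdbc:postgresql://host:5432/db", "target_table", props)
```

Increasing the partition count raises write parallelism, at the cost of more concurrent connections to Postgres.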
Hi,
hope this will help you:

import org.apache.spark.sql.functions._
import sqlContext.implicits._
import java.sql.Timestamp

val df = sc.parallelize(Array((date1, date2))).toDF("day1", "day2")
// UDF returning the difference between two timestamps
// (body completed here as a difference in days; the original was cut off)
val dateDiff = udf[Long, Timestamp, Timestamp]((value1, value2) =>
  (value2.getTime - value1.getTime) / (24L * 60 * 60 * 1000))
Hi,
We want Marathon to start and monitor Chronos, so that when a
Chronos-based Spark job fails, Marathon automatically restarts it within
the scope of Chronos. Will this approach work if we start Spark jobs as shell
scripts from Chronos or Marathon?
Hi.
I'm trying to trigger DataFrame's save method in parallel from my driver.
For that purpose I use an ExecutorService and Futures; here's my code:

val futures = List(1, 2, 3).map(t => pool.submit(new Runnable {
  override def run(): Unit = {
    val commons = events.filter(_._1 == t).map(_._2.common)
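A completed sketch of that pattern, with assumptions flagged: `events` and the output path are illustrative stand-ins for the poster's data, and `Callable` is used instead of `Runnable` so failures propagate through `get()`. Spark's scheduler is thread-safe, so jobs submitted from several driver threads do run concurrently.

```scala
import java.util.concurrent.{Callable, Executors, TimeUnit}

val pool = Executors.newFixedThreadPool(3)

val futures = List(1, 2, 3).map { t =>
  pool.submit(new Callable[Unit] {
    override def call(): Unit = {
      // Each task writes the slice of `events` for one key (illustrative).
      val commons = events.filter(_._1 == t).map(_._2.common)
      commons.toDF().write.parquet(s"/tmp/commons-$t") // placeholder sink
    }
  })
}

futures.foreach(_.get())   // rethrows any exception raised inside a task
pool.shutdown()
pool.awaitTermination(1, TimeUnit.HOURS)
```

Whether the jobs actually overlap also depends on the driver's scheduling mode (FIFO vs FAIR scheduler pools).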
Trying to build a recommendation system using Spark MLlib's ALS.
Currently, we're trying to pre-build recommendations for all users on a daily
basis. We're using simple implicit feedback and ALS.
The problem is, we have 20M users and 30M products, and to call the main
predict() method, we need to
that were or are
likely to be active soon. (Or compute on the fly.) Is anything like
that an option?
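The suggestion above (score only users likely to be active) can be sketched as follows; `ratings` and `activeUserIds` are illustrative RDDs, and the hyperparameter values are placeholders, not tuned recommendations.

```scala
import org.apache.spark.mllib.recommendation.{ALS, Rating}

// Train on implicit feedback rather than explicit ratings.
val model = ALS.trainImplicit(ratings, rank = 10, iterations = 10,
  lambda = 0.01, alpha = 1.0)

// Score only the (much smaller) active-user subset instead of all
// 20M x 30M user/product pairs. Assumption: the active set is small
// enough to collect; later MLlib versions also offer
// recommendProductsForUsers for a fully distributed variant.
val recs = activeUserIds.collect().map { user =>
  user -> model.recommendProducts(user, 10)
}
```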
On Wed, Mar 18, 2015 at 7:13 AM, Aram Mkrtchyan
aram.mkrtchyan...@gmail.com wrote:
On Wed, Mar 18, 2015 at 8:04 AM, Aram Mkrtchyan
aram.mkrtchyan...@gmail.com wrote:
Thanks much for your reply.
By "on the fly", do you mean caching the trained model and querying it
for each user, joined with the 30M products, when needed?
Our question is more about the general approach: what if we have