[Spark ML] : Implement the Conjugate Gradient method for ALS

2017-11-30 Thread Nate Wendt
The conjugate gradient method has been shown to be very efficient at solving the least squares error problem in matrix factorization: http://www.benfrederickson.com/fast-implicit-matrix-factorization/. This post is motivated by:

RE: Spark on Apache Ignite?

2016-01-05 Thread nate
We started playing with Ignite to back Hadoop, Hive, and Spark services, and are looking to move to it as our default for deployment going forward. It's still early, but so far it's been pretty nice, and we're excited for the flexibility it will provide for our particular use cases. Would say in general it's worth

Powered by Spark page

2015-11-12 Thread Nate Kupp
, Spark SQL, MLlib *Use case:* We are using Spark for supporting analytics on both our relational and event data, building data products, and big data processing. Thanks! -Nate

RE: Benchmark results between Flink and Spark

2015-07-05 Thread nate
Flink may benefit from some of the points they outline here: http://flink.apache.org/news/2015/05/11/Juggling-with-Bits-and-Bytes.html If the benchmarks were re-run with the 1.5/Tungsten line, the gap would probably close a bit (or a lot), with Spark moving towards a similar style of off-heap memory management,

RE: Word2Vec with billion-word corpora

2015-05-19 Thread nate
Might also want to look at the Y! post; it looks like they are experimenting with similar efforts in large-scale word2vec: http://yahooeng.tumblr.com/post/118860853846/distributed-word2vec-on-top-of-pistachio -Original Message- From: Xiangrui Meng [mailto:men...@gmail.com] Sent: Tuesday,

RE: Connecting a PHP/Java applications to Spark SQL Thrift Server

2015-03-03 Thread nate
Spark SQL supports JDBC/ODBC connectivity, so if that's the route you need or want to connect through, you could do so from Java/PHP apps. Haven't used either, so can't speak to the developer experience; assume it's pretty good, as it would be the preferred method for lots of third-party enterprise

RE: Apache Ignite vs Apache Spark

2015-02-26 Thread nate
The Ignite guys spoke at the Bigtop workshop last week at SCALE and posted slides here: https://cwiki.apache.org/confluence/display/BIGTOP/SCALE13x A couple of main points from comments made during the presentation: although it is incubating at Apache (first code drop was last week, I believe), the tech is battle-tested with

RE: Submit Spark applications from a machine that doesn't have Java installed

2015-01-11 Thread Nate D'Amico
Can't speak to the internals of SparkSubmit or how to reproduce it without a JVM; I guess it would depend on whether you want or need to support various deployment environments (standalone, Mesos, YARN, etc.). If you just need YARN, or are looking for a starting point, you might want to look at the capabilities of the YARN API: