The latest version of MLlib has it built in, no?
J
Sent from my iPhone
On Nov 30, 2014, at 9:36 AM, shahab shahab.mok...@gmail.com wrote:
Hi,
I just wonder if there is any implementation for Item-based Collaborative
Filtering in Spark?
best,
/Shahab
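For context: what MLlib ships at this point is ALS (model-based matrix factorization), not classic item-based collaborative filtering. The item-based idea itself is just item-item similarity over rating vectors; a minimal Spark-independent sketch in plain Python, with made-up toy ratings, might look like this (not the MLlib API):

```python
from math import sqrt

def item_similarity(ratings):
    """Cosine similarity between item rating vectors.
    `ratings` maps user -> {item: rating}."""
    # Invert to item -> {user: rating}
    by_item = {}
    for user, prefs in ratings.items():
        for item, r in prefs.items():
            by_item.setdefault(item, {})[user] = r
    sims = {}
    items = sorted(by_item)
    for i in items:
        for j in items:
            if i >= j:
                continue
            common = set(by_item[i]) & set(by_item[j])
            if not common:
                continue
            num = sum(by_item[i][u] * by_item[j][u] for u in common)
            den = (sqrt(sum(v * v for v in by_item[i].values()))
                   * sqrt(sum(v * v for v in by_item[j].values())))
            sims[(i, j)] = num / den
    return sims

# Toy ratings (hypothetical data, just for illustration)
ratings = {
    "u1": {"a": 5.0, "b": 3.0},
    "u2": {"a": 4.0, "b": 3.5, "c": 1.0},
    "u3": {"b": 2.0, "c": 5.0},
}
sims = item_similarity(ratings)
```

In real Spark you would distribute the item-vector pairs (or use ALS and rank by latent-factor similarity), but the similarity computation is the same.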
You could also use the jodatime library, which has a ton of other great
options in it.
J
*JIMMY MCERLAIN*
DATA SCIENTIST (NERD)
*. . . . . . . . . . . . . . . . . .*
*IF WE CAN’T DOUBLE YOUR SALES,*
*ONE OF US IS IN THE WRONG BUSINESS.*
*E*: ji...@sellpoints.com
*M*: 510.303.7751
I have used Oozie for all our workflows with Spark apps, but you will have
to use a Java action as the workflow element. I am interested in anyone's
experience with Luigi and/or any other tools.
On Mon, Nov 10, 2014 at 10:34 AM, Adamantios Corais
adamantios.cor...@gmail.com wrote:
I have some
Can you be more specific? What version of Spark, Hive, Hadoop, etc.?
What are you trying to do, and what issues are you seeing?
J
and over again to fit models, so it's pulled into memory once and then
basically analyzed through the algos... other DB systems are reading and
writing to disk repeatedly and are thus slower, such as Mahout (though it's
getting ported over to Spark as well to compete with MLlib)...
J
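The point above is that iterative ML algorithms scan the same dataset many times, so loading it into memory once and reusing it beats re-reading from disk on every pass. A plain-Python sketch of that idea (the `load_records` function is a hypothetical stand-in for an expensive disk read; in Spark this is roughly what `rdd.cache()` buys you):

```python
import time

def load_records():
    """Stand-in for an expensive disk read / deserialization step."""
    time.sleep(0.01)  # simulate I/O latency
    return list(range(1000))

class CachedDataset:
    """Load once, then hand out the in-memory copy on every pass."""
    def __init__(self, loader):
        self._loader = loader
        self._data = None
        self.loads = 0  # how many times we actually hit "disk"

    def get(self):
        if self._data is None:
            self._data = self._loader()
            self.loads += 1
        return self._data

ds = CachedDataset(load_records)
# An "iterative algorithm": 10 passes over the same data,
# but only one load from the slow source.
total = sum(sum(ds.get()) for _ in range(10))
```

Without the cache, each of the 10 passes would pay the load cost again, which is the disk-bound behavior being contrasted with Spark above.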
What ODBC driver are you using? We recently got the Hortonworks ODBC drivers
working on a Windows box but were having issues on a Mac.
Sent from my iPhone
On Oct 30, 2014, at 4:23 AM, Bojan Kostic blood9ra...@gmail.com wrote:
I'm testing the beta driver from Databricks for Tableau.
And
Watch the app manager; it should tell you what's running and taking a while... My
guess is it's a distinct function on the data.
J
Sent from my iPhone
On Oct 30, 2014, at 8:22 AM, peng xia toxiap...@gmail.com wrote:
Hi,
Previously we applied the SVM algorithm in MLlib to 5 million records
, Xiangrui Meng men...@gmail.com wrote:
Did you cache the data and check the load balancing? How many
features? Which API are you using: Scala, Java, or Python? -Xiangrui
On Thu, Oct 30, 2014 at 9:13 AM, Jimmy ji...@sellpoints.com wrote:
Watch the app manager it should tell you what's
is working fine... it leads me to believe that it is a bug
within the REPL for 1.1
Can anyone else confirm this?
Does anyone know anything re: this error? Thank you!
On Wed, Oct 15, 2014 at 3:38 PM, Jimmy Li jimmy...@bluelabs.com wrote:
Hi there, I'm running spark on ec2, and am running into an error there
that I don't get locally. Here's the error:
11335 [handle-read-write-executor-3] ERROR
])
java.nio.channels.ClosedChannelException
Does anyone know what might be causing this? Spark is running on my ec2
instances.
Thanks,
Jimmy
then
pushing them out to the cluster and pointing them to corresponding
dependent jars
Sorry I cannot be more help!
J
Having the exact same error with the exact same jar! Do you work for
Altiscale? :)
J
Sent from my iPhone
On Oct 13, 2014, at 5:33 PM, Andy Srine andy.sr...@gmail.com wrote:
Hi Guys,
Spark rookie here. I am getting a file not found exception on the --jars.
This is on the yarn
BTW this has always worked for me before until we upgraded the cluster to
Spark 1.1.1...
J
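For reference, the usual shape of a YARN submission with dependent jars looks like the fragment below (class name and all paths are placeholders, not taken from the thread); on YARN, the jars listed in `--jars` are shipped from the local filesystem out to the executors, so a "file not found" on them usually means the local paths are wrong on the submitting machine:

```shell
# Illustrative only -- app jar, main class, and dependency paths are placeholders.
spark-submit \
  --master yarn-cluster \
  --class com.example.MyApp \
  --jars /local/path/dep1.jar,/local/path/dep2.jar \
  /local/path/myapp.jar
```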
Yeah I'm using 1.0.0 and thanks for taking the time to check!
Sent from my iPhone
On Oct 1, 2014, at 8:48 PM, Xiangrui Meng men...@gmail.com wrote:
Which Spark version are you using? It works in 1.1.0 but not in 1.0.0.
-Xiangrui
On Wed, Oct 1, 2014 at 2:13 PM, Jimmy McErlain ji
Not sure if this is what you are after, but it's based on a moving average
within Spark... I was building an ARIMA model on top of Spark and this
helped me out a lot:
http://stackoverflow.com/questions/23402303/apache-spark-moving-average
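Outside Spark, the core of a simple moving average is just a fixed-size sliding window; a minimal plain-Python sketch of the computation (the windowing itself is what the linked answer distributes across partitions):

```python
def moving_average(xs, window):
    """Simple moving average over a fixed-size sliding window.
    Returns len(xs) - window + 1 values."""
    if window <= 0 or window > len(xs):
        raise ValueError("window must be in 1..len(xs)")
    out = []
    total = sum(xs[:window])          # first window
    out.append(total / window)
    for i in range(window, len(xs)):
        total += xs[i] - xs[i - window]  # slide: add new, drop oldest
        out.append(total / window)
    return out
```

For example, `moving_average([1, 2, 3, 4, 5], 3)` averages the windows `[1,2,3]`, `[2,3,4]`, `[3,4,5]`.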