Hi Xiangrui,
With 4 ALS iterations it runs fine...If I run 10 it fails...I believe I
have to cut the lineage chain and call checkpoint. Trying to follow the
other email chain on checkpointing...
Thanks.
Deb
On Sun, Apr 6, 2014 at 9:08 PM, Xiangrui Meng men...@gmail.com wrote:
Hi Deb,
On the partitioning / id keys: if we look at hash partitioning, how
feasible would it be to just allow the user and item ids to be strings? A
lot of the time these ids are strings anyway (UUIDs and so on), and it's
really painful to translate between String - Int the whole time.
Are there
Nick,
I already have this code which calls dictionary generation and then maps
string etc to ints...I think the core algorithm should stay in ints...if
you like I can add this code in MFUtils.scala...that's the convention I
followed similar to MLUtils.scala...actually these functions should be
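
A minimal sketch of the dictionary-generation idea described above, written in plain Python rather than against Spark, so the core algorithm can keep working on ints. The function names and sample ids are hypothetical and are not taken from MFUtils.scala or MLUtils.scala.

```python
from typing import Dict, Iterable, List, Tuple

Rating = Tuple[str, str, float]


def build_id_dictionary(ids: Iterable[str]) -> Dict[str, int]:
    """Assign each distinct string id a dense int index in first-seen order."""
    mapping: Dict[str, int] = {}
    for raw_id in ids:
        if raw_id not in mapping:
            mapping[raw_id] = len(mapping)
    return mapping


def encode_ratings(
    ratings: List[Rating],
    user_index: Dict[str, int],
    item_index: Dict[str, int],
) -> List[Tuple[int, int, float]]:
    """Replace string user/item ids with their int indices."""
    return [(user_index[u], item_index[i], r) for u, i, r in ratings]


# Hypothetical sample data with string ids (UUID-like keys in practice).
ratings = [
    ("u-9f3a", "item-A", 4.0),
    ("u-17c2", "item-B", 1.0),
    ("u-9f3a", "item-B", 5.0),
]

user_index = build_id_dictionary(u for u, _, _ in ratings)
item_index = build_id_dictionary(i for _, i, _ in ratings)
encoded = encode_ratings(ratings, user_index, item_index)

# Reverse dictionary, to translate int indices back to string ids afterwards.
user_lookup = {idx: raw_id for raw_id, idx in user_index.items()}
```

On a real dataset the dictionaries would have to be built in a distributed way; this only shows the shape of the translation step.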
Tachyon is Java 6 compatible from version 0.4. Besides putting input/output
data in Tachyon ( http://tachyon-project.org/Running-Spark-on-Tachyon.html ),
Spark applications can also persist data into Tachyon (
https://github.com/apache/spark/blob/master/docs/scala-programming-guide.md
).
On Mon,
Hi,
How do I contribute to Spark and its associated projects?
Appreciate the help...
Thanks
Mukesh
This is a good place to start:
https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark
Sujeet
On Mon, Apr 7, 2014 at 9:20 AM, Mukesh G muk...@gmail.com wrote:
Hi,
How do I contribute to Spark and its associated projects?
Appreciate the help...
Thanks
Mukesh
I am using master...
No negative indexes...
If I run with 4 iterations it runs fine and I can generate factors...
With 10 iterations the run fails with an array index out of bounds error...
25m users and 3m products are within int limits
Would it help if I pointed you to the logs for both runs?
Hi all,
The InputStreamsSuite seems to have some serious flakiness issues -- I've
seen the file input stream fail many times and now I'm seeing some actor
input stream test failures (
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13846/consoleFull)
on what I think is an
I hit this issue when Jenkins seemed to be very busy
On Monday, April 7, 2014, Kay Ousterhout k...@eecs.berkeley.edu wrote:
Hi all,
The InputStreamsSuite seems to have some serious flakiness issues -- I've
seen the file input stream fail many times and now I'm seeing some actor
input
TD - do you know what is going on here?
I looked into this a bit and at least a few of these use
Thread.sleep() and assume the sleep will be exact, which is wrong. We
should disable all the tests that do, and they should probably be rewritten
to virtualize time.
- Patrick
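
A rough sketch of what "virtualizing time" could look like, shown in Python for brevity: the code under test reads time from an injected clock, and the test advances that clock explicitly instead of sleeping. All class names here are illustrative assumptions, not Spark's actual test utilities.

```python
import time


class SystemClock:
    """Real clock, used in production code paths."""

    def now(self) -> float:
        return time.monotonic()


class ManualClock:
    """Test clock: time only moves when the test calls advance()."""

    def __init__(self, start: float = 0.0) -> None:
        self._now = start

    def now(self) -> float:
        return self._now

    def advance(self, seconds: float) -> None:
        self._now += seconds


class BatchTimer:
    """Reports how many whole batch intervals have elapsed since creation."""

    def __init__(self, clock, interval: float) -> None:
        self._clock = clock
        self._interval = interval
        self._start = clock.now()

    def batches_due(self) -> int:
        return int((self._clock.now() - self._start) // self._interval)


# In a test, no sleeping is needed and the result is deterministic.
clock = ManualClock()
timer = BatchTimer(clock, interval=1.0)
clock.advance(2.5)
due = timer.batches_due()
```

Because the test never waits on the wall clock, a busy build machine cannot change the outcome.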
On Mon, Apr 7,
There is a JIRA for one of the flakey tests here:
https://issues.apache.org/jira/browse/SPARK-1409
On Mon, Apr 7, 2014 at 11:32 AM, Patrick Wendell pwend...@gmail.com wrote:
TD - do you know what is going on here?
I looked into this a bit and at least a few of these use
Thread.sleep()
Yes, I will take a look at those tests ASAP.
TD
On Mon, Apr 7, 2014 at 11:32 AM, Patrick Wendell pwend...@gmail.com wrote:
TD - do you know what is going on here?
I looked into this a bit and at least a few of these use
Thread.sleep() and assume the sleep will be exact, which is
Hi Deb,
It would be helpful if you could attach the logs. It is strange that
you can run 4 iterations but not 10.
Xiangrui
On Mon, Apr 7, 2014 at 10:36 AM, Debasish Das debasish.da...@gmail.com wrote:
I am using master...
No negative indexes...
If I run with 4 iterations it runs
Hi,
From my testing of Spark Streaming with Flume, it seems that there's
only one of the Spark worker nodes that runs a Flume Avro RPC server to
receive messages at any given time, as opposed to every Spark worker
running an Avro RPC server to receive messages. Is this the case? Our
use-case
You can configure your sinks to write to one or more Avro sources in a
load-balanced configuration.
https://flume.apache.org/FlumeUserGuide.html#flume-sink-processors
mfe
On Mon, Apr 7, 2014 at 3:19 PM, Christophe Clapp
christo...@christophe.cc wrote:
Hi,
From my testing of Spark Streaming
I don't see why not. If one were doing something similar with straight
Flume, you'd start an agent on each node you care to receive Avro/RPC
events. In the absence of clearer insight into your use case, I'm puzzling
just a little over why it's necessary for each Worker to be its own receiver,
but there's
Cool. I'll look at making the code change in FlumeUtils and generating a
pull request.
As for the use case, the volume of messages we have is currently about
30 MB per second, which may grow beyond what a 1 Gbit network adapter can
handle.
- Christophe
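
For a sense of scale, here is a back-of-the-envelope calculation of the figures above. It assumes decimal units (1 Gbit = 1000 Mbit), 8 bits per byte, and no protocol overhead, so the real ceiling on a single receiver will be somewhat lower.

```python
# Figures quoted in the message above.
MESSAGE_RATE_MB_PER_S = 30.0
LINK_GBIT_PER_S = 1.0

# Assumption: decimal units and no protocol overhead.
link_mb_per_s = LINK_GBIT_PER_S * 1000 / 8          # theoretical link capacity in MB/s
utilization = MESSAGE_RATE_MB_PER_S / link_mb_per_s  # fraction of the link in use today
growth_factor = link_mb_per_s / MESSAGE_RATE_MB_PER_S  # growth before one link saturates

print(f"link capacity: {link_mb_per_s:.0f} MB/s")
print(f"current utilization: {utilization:.0%}")
print(f"growth before saturation: {growth_factor:.1f}x")
```

Under these assumptions a single receiver node has room for roughly a fourfold increase in volume before its network adapter becomes the bottleneck.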
On Apr 7, 2014 1:51 PM, Michael Ernest
Hi guys,
The latest PR uses Breeze's L-BFGS implementation, which was introduced by
Xiangrui's sparse input format work in SPARK-1212.
https://github.com/apache/spark/pull/353
Now, it works with the new sparse framework!
Any feedback would be greatly appreciated.
Thanks.
Sincerely,
DB Tsai
Hi Sujeet,
Thanks. I went through the website and it looks great. Is there a list of
items that I can choose from to contribute to?
Thanks
Mukesh
On Mon, Apr 7, 2014 at 10:14 PM, Sujeet Varakhedi
svarakh...@gopivotal.com wrote:
This is a good place to start:
I’d suggest looking for the issues labeled “Starter” on JIRA. You can find them
here:
https://issues.apache.org/jira/browse/SPARK-1438?jql=project%20%3D%20SPARK%20AND%20labels%20%3D%20Starter%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)
Matei
On Apr 7, 2014, at 9:45 PM,