I'm trying to solve a word-count-like problem. The difference is that I
need the count of a specific word within a specific timespan in a social message
stream.
My data is in the format (time, message), and I transformed it (flatMap, etc.)
into a series of (time, word_id) pairs; the time is
Do these datetime objects implement the notion of equality you'd
expect? (This may be a dumb question; I'm thinking of the equivalent
of equals() / hashCode() from the Java world.)
On Sat, Apr 18, 2015 at 4:17 PM, SecondDatke
lovejay-lovemu...@outlook.com wrote:
I'm trying to solve a Word
() ?
What release of Spark are you using?
Cheers
On Sat, Apr 18, 2015 at 8:17 AM, SecondDatke lovejay-lovemu...@outlook.com
wrote:
I'm trying to solve a word-count-like problem. The difference is that I
need the count of a specific word within a specific timespan in a social message
stream
?
To: lovejay-lovemu...@outlook.com
CC: user@spark.apache.org
Do these datetime objects implement the notion of equality you'd
expect? (This may be a dumb question; I'm thinking of the equivalent
of equals() / hashCode() from the Java world.)
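
For what it's worth, here is a minimal plain-Python sketch of the equality question, assuming the time values are naive standard-library datetime objects (the thread does not say which datetime type is in use). The sample records and the hour_bucket helper are hypothetical, and a Counter stands in for a keyed aggregation; this is not the original poster's Spark job.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample records in the (time, word_id) shape described above.
records = [
    (datetime(2015, 4, 18, 8, 17), 42),
    (datetime(2015, 4, 18, 8, 45), 42),
    (datetime(2015, 4, 18, 9, 5), 42),
    (datetime(2015, 4, 18, 8, 30), 7),
]


def hour_bucket(ts):
    """Truncate a timestamp to the start of its hour (an assumed timespan)."""
    return ts.replace(minute=0, second=0, microsecond=0)


# Two separately constructed datetimes with the same fields compare equal
# and hash alike, which is the Python analogue of equals() / hashCode().
a = datetime(2015, 4, 18, 8, 0)
b = datetime(2015, 4, 18, 8, 0)
same_key = (a == b) and (hash(a) == hash(b))

# Count occurrences per (timespan, word_id) key.
counts = Counter((hour_bucket(ts), word_id) for ts, word_id in records)
```

If the keys behave like this, (bucket, word_id) tuples should group as expected in a keyed aggregation; if the time values are some other type, its __eq__ and __hash__ would need the same check.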
On Sat, Apr 18, 2015 at 4:17 PM, SecondDatke
Well, this may be a Linux configuration problem...
I have a cluster that is about to be exposed to the public, and I want everyone
who uses my cluster to have a user account (without sudo permissions, etc.; e.g.
'guest') and to be able to submit tasks to Spark, which is running on Mesos,
which itself runs with a different,
I'm trying to apply Spark to an NLP problem that I'm working on. I have nearly
4 million tweet texts, and I have converted them into word vectors. They are pretty
sparse, because each message has just dozens of words while the vocabulary has
tens of thousands of words.
These vectors should be loaded each