Sure, that code does roughly what you describe, but it's mixed up in
a few ways. It looks like you only want to operate on lines that
contain a word starting with SECRETWORD, but acct is itself a DStream,
not a String, so acct+"_"+word concatenates the DStream's toString
rather than the matching word. The code also prepends acct and _,
while the result you describe seems to expect something appended. You
also seem to want to sum by key, so there needs to be a
reduceByKeyAndWindow in here somewhere, or else a foreachRDD and
reduceByKey. The result is not a sequence of (word,count), but a
sequence of RDDs of (word,count).
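
As a rough sketch of the per-line logic, here it is in plain Scala
collections so it can be tried without a cluster. The names
AcctWordCount, keyedWords and countLine are made up for illustration,
and dropping the SECRETWORD token itself from the output is my
assumption about what you want. The DStream wiring in the trailing
comment is untested.

```scala
object AcctWordCount {
  // Hypothetical helper: find the word starting with SECRETWORD on this
  // line and use it as a key prefix for every other word on the same line.
  def keyedWords(line: String): Seq[(String, Int)] = {
    val words = line.split(" ").filter(_.nonEmpty).toSeq
    words.find(_.startsWith("SECRETWORD")) match {
      // Assumption: the account word itself is not counted.
      case Some(acct) => words.filterNot(_ == acct).map(w => (acct + "_" + w, 1))
      // Assumption: lines without a SECRETWORD token contribute nothing.
      case None => Seq.empty
    }
  }

  // Sum the 1s per key for a single line; this is the step that
  // reduceByKey would perform on each batch in Spark Streaming.
  def countLine(line: String): Map[String, Int] =
    keyedWords(line).groupBy(_._1).map { case (k, vs) => (k, vs.map(_._2).sum) }
}

// With the Kafka stream from your snippet, the same function could
// presumably be plugged in like this (untested):
//   val counts = lines.flatMap(AcctWordCount.keyedWords).reduceByKey(_ + _)
// counts is then a DStream, i.e. one RDD of (key, count) per batch.
```

For your example line, countLine should give counts of 2 for
SECRETWORDthebest_hello and SECRETWORDthebest_world, and 1 for
SECRETWORDthebest_you and SECRETWORDthebest_are. Doing the lookup per
line inside flatMap avoids needing the DStream "as a String" at all.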

On Wed, Oct 29, 2014 at 11:40 PM, Harold Nguyen <har...@nexgate.com> wrote:
> Hi Sean,
>
> I'd just like to take the first "word" of every line, and use it as a
> variable for later. Is there a way to do that?
>
> Here's the gist of what I want to do:
>
>   val lines = KafkaUtils.createStream(ssc, "localhost:2181", "test",
> Map("test" -> 10)).map(_._2)
>   val words = lines.flatMap(_.split(" "))
>   val acct = words.filter(word => word.startsWith("SECRETWORD"))
>   val pairs = words.map(word => (acct+"_"+word, 1))
>
> Take all lines coming into Kafka, and add the word 'acct' to each word.
>
> As an example, here is a line:
>
> "hello world you are SECRETWORDthebest hello world"
>
> And it should do this:
>
> (SECRETWORDthebest_hello, 2), (SECRETWORDthebest_world, 2),
> (SECRETWORDthebest_you, 1), etc...
>
> Harold
>
>
> On Wed, Oct 29, 2014 at 3:36 PM, Sean Owen <so...@cloudera.com> wrote:
>>
>> What would it mean to make a DStream into a String? It's inherently a
>> sequence of things over time, each of which might be a string but
>> which are usually RDDs of things.
>>
>> On Wed, Oct 29, 2014 at 11:15 PM, Harold Nguyen <har...@nexgate.com>
>> wrote:
>> > Hi all,
>> >
>> > How do I convert a DStream to a string ?
>> >
>> > For instance, I want to be able to:
>> >
>> > val myword = words.filter(word => word.startsWith("blah"))
>> >
>> > And use "myword" in other places, like tacking it onto (key, value)
>> > pairs,
>> > like so:
>> >
>> > val pairs = words.map(word => (myword+"_"+word, 1))
>> >
>> > Thanks for any help,
>> >
>> > Harold
>> >
>> >
>> >
>> >
>
>
