>>>> line.split("\n,")).map(word => (word, 1)).reduceByKey(_ + _)
>>>>>>>>>>>> v: org.apache.spark.streaming.dstream.DStream[(String, Int)] =
>>>>>>>>>>>> org.apache.spark.streaming.dstream.ShuffledDStream
>>>>>>>> :43: error: value collect is not a member of
>>>>>>>>>>> org.apache.spark.streaming.dstream.DStream[(String, Int)]
>>>>>>>>>>> val v = lines.filter(_.contains("ASE 15")).filter(_
adeh
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> LinkedIn *
>>>>>>>>>> https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8
ril 2016 at 16:01, Ted Yu <yuzhih...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> bq. is not a member of (String, String)
>>>>>>>>>>
>>>>>>>>>> As shown above, conta
>>>>>> Thank you gents.
>>>>>>>>>>
>>>>>>>>>> That should be "\n" as the newline character
>>>>>>>>>>
>>>>>>>>>> OK I am using spark streaming to analyse the mess
>>>>>>>>> import org.apache.spark.streaming._
>>>>>>>>> import org.apache.spark.streaming.kafka.KafkaUtils
>>>>>>>>> //
>>>>>>>>> scala> val sparkConf = new SparkConf().
>>>>>>>>
" )
>>>>>>> kafkaParams: scala.collection.immutable.Map[String,String] =
>>>>>>> Map(bootstrap.servers -> rhes564:9092, schema.registry.url ->
>>>>>>> http://rhes564:8081, zookeeper.connect -> rhes564:2181, group.id ->
>
ing,
>>>>>> StringDecoder, StringDecoder](ssc, kafkaParams, topic)
>>>>>> messages: org.apache.spark.streaming.dstream.InputDStream[(String,
>>>>>> String)] =
>>>>>> org.apache.spark.streaming.kafka.DirectKafkaInputDStream@5d8ccb6c
>>
>>>>> This part is tricky
>>>>>
>>>>> scala> val showlines = messages.filter(_ contains("ASE 15")).filter(_
>>>>> contains("UPDATE INDEX STATISTICS")).flatMap(line =>
>>>>> line.sp
")).filter(_
>>>> contains("UPDATE INDEX STATISTICS")).flatMap(line =>
>>>> line.split("\n,")).map(word => (word, 1)).reduceByKey(_ +
>>>> _).collect.foreach(println)
>>>>
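The `is not a member of (String, String)` error reported in this thread follows from the element type: `messages` carries key/value pairs, so `contains` has to be applied to the value (`_._2`) rather than to the pair. The sketch below models this with a plain `Seq[(String, String)]` in place of the DStream and `groupBy` in place of `reduceByKey`; the sample records are invented for illustration. It does not exercise Spark itself: a `DStream` has no `collect`, and output operations such as `print()` or `foreachRDD` would be the usual route there.

```scala
// Invented (key, value) records standing in for the Kafka messages (assumption).
val messages: Seq[(String, String)] = Seq(
  ("k1", "ASE 15 UPDATE INDEX STATISTICS"),
  ("k2", "ASE 15 only"),
  ("k3", "ASE 15 UPDATE INDEX STATISTICS")
)

val counts: Map[String, Int] =
  messages
    .map(_._2)                  // keep the value, drop the key
    .filter(v => v.contains("ASE 15") && v.contains("UPDATE INDEX STATISTICS"))
    .flatMap(_.split("\n,"))    // same split argument as in the thread
    .map(word => (word, 1))
    .groupBy(_._1)              // plain-collection analogue of reduceByKey
    .map { case (word, pairs) => (word, pairs.map(_._2).sum) }
```

Only the two records containing both phrases survive the filter, so `counts` ends up with a single key.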
>>>>
>>>> How does one refer to the c
>> <https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>*
>>
>>
>>
>> http://talebzadehmich.wordpress.com
>>
>>
>>
>> On 3 April 2016 at 15:32, Ted Yu <yuzhih...@gmail.com> wrote:
>>
split"\t," splits the filter by carriage return
>>
>> Minor correction: "\t" denotes tab character.
>>
>> On Sun, Apr 3, 2016 at 7:24 AM, Eliran Bivas <elir...@iguaz.io> wrote:
>>
>>> Hi Mich,
>>>
>>> 1. The first undersco
e() results in a collection of strings)
>> 2. You're correct. No need for it.
>> 3. Filter is expecting a Boolean result. So you can merge your contains
>> filters to one with AND (&&) statement.
>> 4. Correct. Each character in split() is used as a divider.
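Points 3 and 4 can be checked without Spark. The sketch below uses a plain Scala `Seq[String]` as a stand-in for the RDD; the sample lines are made up for illustration. One caveat on point 4: when `split` is given a `String`, Scala delegates to Java's regex-based `String.split`, so `"\n,"` matches the two-character sequence newline-then-comma. Splitting on each listed character is what the `Array[Char]` overload does.

```scala
// Made-up sample lines standing in for the file contents (assumption).
val lines = Seq(
  "ASE 15 supports UPDATE INDEX STATISTICS",
  "ASE 15 installation notes",
  "UPDATE INDEX STATISTICS on older releases",
  ""
)

// Two chained filters ...
val chained = lines
  .filter(_.contains("ASE 15"))
  .filter(_.contains("UPDATE INDEX STATISTICS"))

// ... keep the same lines as one filter joining both tests with &&.
val merged = lines.filter(l =>
  l.contains("ASE 15") && l.contains("UPDATE INDEX STATISTICS"))

val sample = "x,y\nz"

// String argument: a regex, so only newline followed by comma would split.
val bySequence = sample.split("\n,")

// Array[Char] argument: splits on a newline or on a comma.
val byEachChar = sample.split(Array('\n', ','))
```

Here `bySequence` holds the whole string as its single element, because `sample` contains no newline immediately followed by a comma, whereas `byEachChar` yields `x`, `y` and `z`.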
>>
>
> Eliran Bivas
>
> *From:* Mich Talebzadeh <mich.talebza...@gmail.com>
> *Sent:* Apr 3, 2016 15:06
> *To:* Eliran Bivas
> *Cc:* user @spark
> *Subject:* Re: multiple splits fails
>
> Hi Eliran,
>
> Many thanks for your input on this.
>
> I thought about
Hi Eliran,
Many thanks for your input on this.
I thought about what I was trying to achieve so I rewrote the logic as
follows:
1. Read the text file in
2. Filter out empty lines (well not really needed here)
3. Search for lines that contain "ASE 15" and further have sentence
Hi Mich,
A few comments:
When doing .filter(_ > "") you’re actually doing a lexicographic comparison
rather than an explicit check for empty lines (which could be written as
_.nonEmpty or _.length > 0).
I think that filtering with _.contains should be sufficient and the first
filter can be omitted.
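A quick way to compare these predicates is to run them on an ordinary Scala collection. In this sketch a `Seq[String]` stands in for the RDD returned by `sc.textFile`, and the sample lines are invented for illustration.

```scala
// Invented sample lines (assumption); note the empty and whitespace-only entries.
val lines = Seq("", "ASE15 upgrade guide", "   ", "Notes on ASE15", "unrelated line")

// Lexicographic comparison against the empty string.
val viaCompare = lines.filter(_ > "")

// Explicit emptiness check.
val viaNonEmpty = lines.filter(_.nonEmpty)

// The contains filter on its own.
val matches = lines.filter(_.contains("ASE15"))

// The contains filter preceded by the emptiness check.
val matchesAfterNonEmpty = lines.filter(_.nonEmpty).filter(_.contains("ASE15"))
```

On these samples the comparison and `nonEmpty` keep the same lines, since every non-empty string sorts after the empty string; `nonEmpty` simply states the intent directly. The `contains` filter alone already drops the empty line, so the extra filter changes nothing.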
As
Hi,
I am not sure this is the correct approach
Read a text file in
val f = sc.textFile("/tmp/ASE15UpgradeGuide.txt")
Now I want to get rid of empty lines and filter only the lines that contain
"ASE15"
f.filter(_ > "").filter(_ contains("ASE15")).
The above works but I am not sure whether I