Re: Machine learning with spark (book code example error)

Fengdong Yu Wed, 14 Oct 2015 01:56:49 -0700

I don't recommend this code style; it is better to wrap the function body in braces.


val testLabels = testRDD.map { case (file, text) => {
  val topic = file.split("/").takeRight(2).head
  newsgroupsMap(topic)
} }
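
For anyone hitting the same error: when the two statements end up on one line without a separator, Scala reads `head newsgroupsMap(topic)` as the infix call `head.newsgroupsMap(topic)`, which explains "value newsgroupsMap is not a member of String". Below is a minimal sketch that runs in the plain Scala console without Spark. The sample paths and the two-entry map are made-up stand-ins for testRDD and the real newsgroupsMap, and a Seq is used in place of the RDD.

```scala
// Hypothetical stand-ins for the thread's data, so the sketch runs without Spark.
val newsgroupsMap = Map("sci.space" -> 0, "rec.autos" -> 1)
val testFiles = Seq(
  ("/PATH/20news-bydate-test/sci.space/1234", "text a"),
  ("/PATH/20news-bydate-test/rec.autos/5678", "text b")
)

// Multi-line form: the line break separates the two statements.
val testLabels = testFiles.map { case (file, text) =>
  // The second-to-last path component is the newsgroup name.
  val topic = file.split("/").takeRight(2).head
  newsgroupsMap(topic)
}

// One-line form: an explicit semicolon is needed between the statements.
val testLabelsOneLine = testFiles.map { case (file, text) => val topic = file.split("/").takeRight(2).head; newsgroupsMap(topic) }
```

Both forms should produce the same labels; the same pattern applies unchanged to testRDD.map in the Spark shell.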


> On Oct 14, 2015, at 15:46, Nick Pentreath <nick.pentre...@gmail.com> wrote:
> 
> Hi there. I'm the author of the book (thanks for buying it by the way :)
> 
> Ideally if you're having any trouble with the book or code, it's best to 
> contact the publisher and submit a query 
> (https://www.packtpub.com/books/content/support/17400).
> 
> However, I can help with this issue. The problem is that the "testLabels" 
> code needs to be split across multiple lines:
> 
> val testPath = "/PATH/20news-bydate-test/*"
> val testRDD = sc.wholeTextFiles(testPath)
> val testLabels = testRDD.map { case (file, text) => 
>       val topic = file.split("/").takeRight(2).head
>       newsgroupsMap(topic)
> }
> 
> As it is in the sample code attached. If you copy the whole indented block 
> (or line by line) into the console, it should work - I've tested all the 
> sample code again and indeed it works for me.
> 
> Hope this helps
> Nick
> 
> On Tue, Oct 13, 2015 at 8:31 PM, Zsombor Egyed <egye...@starschema.net> wrote:
> Hi!
> 
> I was reading the ML with Spark book, and I was very interested in chapter 9 
> (text mining), so I tried the code examples. 
> 
> Everything was fine, but in this line:
> val testLabels = testRDD.map { 
> case (file, text) => val topic = file.split("/").takeRight(2).head
> newsgroupsMap(topic) }
> I got an error: "value newsgroupsMap is not a member of String"
> 
> Other relevant parts of the code:
> val path = "/PATH/20news-bydate-train/*"
> val rdd = sc.wholeTextFiles(path) 
> val newsgroups = rdd.map { case (file, text) => file.split("/").takeRight(2).head }
> 
> val tf = hashingTF.transform(tokens)
> val idf = new IDF().fit(tf)
> val tfidf = idf.transform(tf)
> 
> val newsgroupsMap = newsgroups.distinct.collect().zipWithIndex.toMap
> val zipped = newsgroups.zip(tfidf)
> val train = zipped.map { case (topic, vector) =>
>   LabeledPoint(newsgroupsMap(topic), vector) }
> train.cache
> 
> val model = NaiveBayes.train(train, lambda = 0.1)
> 
> val testPath = "/PATH//20news-bydate-test/*"
> val testRDD = sc.wholeTextFiles(testPath)
> val testLabels = testRDD.map { case (file, text) => val topic = 
> file.split("/").takeRight(2).head newsgroupsMap(topic) }
> 
> I have attached the whole program code. 
> Can anyone help me find out what the problem is?
> 
> Regards,
> Zsombor
> 
> 
> 
