I don't recommend this code style; it is better to wrap the function body in braces.
val testLabels = testRDD.map { case (file, text) => {
    val topic = file.split("/").takeRight(2).head
    newsgroupsMap(topic)
  }
}

> On Oct 14, 2015, at 15:46, Nick Pentreath <nick.pentre...@gmail.com> wrote:
>
> Hi there. I'm the author of the book (thanks for buying it by the way :)
>
> Ideally if you're having any trouble with the book or code, it's best to
> contact the publisher and submit a query
> (https://www.packtpub.com/books/content/support/17400)
>
> However, I can help with this issue. The problem is that the "testLabels"
> code needs to be indented over multiple lines:
>
> val testPath = "/PATH/20news-bydate-test/*"
> val testRDD = sc.wholeTextFiles(testPath)
> val testLabels = testRDD.map { case (file, text) =>
>   val topic = file.split("/").takeRight(2).head
>   newsgroupsMap(topic)
> }
>
> As it is in the sample code attached. If you copy the whole indented block
> (or line by line) into the console, it should work - I've tested all the
> sample code again and indeed it works for me.
>
> Hope this helps
> Nick
>
> On Tue, Oct 13, 2015 at 8:31 PM, Zsombor Egyed <egye...@starschema.net> wrote:
> Hi!
>
> I was reading the ML with Spark book and was very interested in chapter 9
> (text mining), so I tried the code examples.
>
> Everything was fine, but in this line:
> val testLabels = testRDD.map {
> case (file, text) => val topic = file.split("/").takeRight(2).head
> newsgroupsMap(topic) }
> I got an error: "value newsgroupsMap is not a member of String"
>
> Other relevant parts of the code:
> val path = "/PATH/20news-bydate-train/*"
> val rdd = sc.wholeTextFiles(path)
> val newsgroups = rdd.map { case (file, text) =>
> file.split("/").takeRight(2).head }
>
> val tf = hashingTF.transform(tokens)
> val idf = new IDF().fit(tf)
> val tfidf = idf.transform(tf)
>
> val newsgroupsMap = newsgroups.distinct.collect().zipWithIndex.toMap
> val zipped = newsgroups.zip(tfidf)
> val train = zipped.map { case (topic, vector) => LabeledPoint(newsgroupsMap(topic), vector) }
> train.cache
>
> val model = NaiveBayes.train(train, lambda = 0.1)
>
> val testPath = "/PATH//20news-bydate-test/*"
> val testRDD = sc.wholeTextFiles(testPath)
> val testLabels = testRDD.map { case (file, text) => val topic = file.split("/").takeRight(2).head newsgroupsMap(topic) }
>
> I attached the whole program code.
> Can anyone help me figure out what the problem is?
>
> Regards,
> Zsombor
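
For anyone hitting the same error: when both statements sit on one line with nothing between them, Scala reads `head newsgroupsMap(topic)` as infix notation, i.e. a call to `.newsgroupsMap(topic)` on the String returned by `head`, which is why the compiler reports that `newsgroupsMap` is not a member of String. Below is a minimal sketch of the two working layouts using plain Scala collections instead of Spark, so it can be pasted into a REPL. The object name `LabelSketch`, the sample paths in `files`, and the contents of `newsgroupsMap` are made up for illustration and are not from the book's code.

```scala
object LabelSketch {
  // Hypothetical stand-in for the map built from the training set.
  val newsgroupsMap: Map[String, Int] = Map("sci.space" -> 0, "rec.autos" -> 1)

  // Hypothetical (file path, text) pairs, shaped like the output of wholeTextFiles.
  val files: Seq[(String, String)] = Seq(
    ("/PATH/20news-bydate-test/sci.space/1", "text a"),
    ("/PATH/20news-bydate-test/rec.autos/2", "text b")
  )

  // One statement per line: the newline ends the `val topic` definition,
  // so `newsgroupsMap(topic)` is parsed as its own expression.
  val labelsNewline: Seq[Int] = files.map { case (file, _) =>
    val topic = file.split("/").takeRight(2).head
    newsgroupsMap(topic)
  }

  // Single-line form: an explicit semicolon separates the two statements.
  val labelsSemicolon: Seq[Int] = files.map { case (file, _) => val topic = file.split("/").takeRight(2).head; newsgroupsMap(topic) }
}
```

Both layouts produce the same labels; the extra braces around the function body suggested at the top of this message are optional, since what matters is the statement separator.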