Can one of the Scala experts please explain this bit of pattern magic from
the Spark ML tutorial: _._2.user ?
As near as I can tell, this is applying the _2 function to the wildcard, and
then applying the 'user' function to that. In a similar way the 'product'
function is applied in the next line, yet these functions don't seem to
exist anywhere in the project, nor are they used anywhere else in the code.
It almost makes sense, but not quite. Code below:
val ratings = sc.textFile(new File(movieLensHomeDir, "ratings.dat").toString)
  .map { line =>
    val fields = line.split("::")
    // format: (timestamp % 10, Rating(userId, movieId, rating))
    (fields(3).toLong % 10,
     Rating(fields(0).toInt, fields(1).toInt, fields(2).toDouble))
  }
…
val numRatings = ratings.count
val numUsers = ratings.map(_._2.user).distinct.count
val numMovies = ratings.map(_._2.product).distinct.count
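
For reference, here is a minimal sketch of how I currently read that expression. I'm assuming Rating is the case class from org.apache.spark.mllib.recommendation with user, product and rating fields; the stand-in definition and sample values below are just for illustration:

  // Stand-in for Spark's Rating case class (assumed shape, illustration only)
  case class Rating(user: Int, product: Int, rating: Double)

  // Each element of the ratings RDD would then be a (Long, Rating) pair
  val pair: (Long, Rating) = (3L, Rating(1, 1193, 5.0))

  // _2 reads the second element of the tuple (the Rating),
  // and .user then reads that field of the case class:
  val u = pair._2.user   // 1

  // So ratings.map(_._2.user) would be shorthand for:
  // ratings.map(r => r._2.user)

Is that roughly what is going on?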
Cheers,
- Steve Nunez