Forgot to mention: here is how I run the program:

    ./bin/spark-submit --conf "spark.app.master"="local[1]" ~/workspace/spark-python/ApacheLogWebServerAnalysis.py

On Wednesday, 12 August 2015 10:28 AM, Spark Enthusiast <sparkenthusi...@yahoo.in> wrote:

I wrote a small Python program:

    def parseLogs(self):
        """ Read and parse log file """
        self._logger.debug("Parselogs() start")
        self.parsed_logs = (self._sc
                            .textFile(self._logFile)
                            .map(self._parseApacheLogLine)
                            .cache())

        self.access_logs = (self.parsed_logs
                            .filter(lambda s: s[1] == 1)
                            .map(lambda s: s[0])
                            .cache())

        self.failed_logs = (self.parsed_logs
                            .filter(lambda s: s[1] == 0)
                            .map(lambda s: s[0]))
        failed_logs_count = self.failed_logs.count()
        if failed_logs_count > 0:
            self._logger.debug('Number of invalid logline: %d' % self.failed_logs.count())
            for line in self.failed_logs.take(20):
                self._logger.debug('Invalid logline: %s' % line)

        self._logger.debug('Read %d lines, successfully parsed %d lines, failed to parse %d lines' % \
            (self.parsed_logs.count(), self.access_logs.count(), self.failed_logs.count()))

        return (self.parsed_logs, self.access_logs, self.failed_logs)

    def main(argv):
        try:
            logger = createLogger("pyspark", logging.DEBUG, "LogAnalyzer.log", "./")
            logger.debug("Starting LogAnalyzer")
            myLogAnalyzer = ApacheLogAnalyzer(logger)
            (parsed_logs, access_logs, failed_logs) = myLogAnalyzer.parseLogs()
        except Exception as e:
            print "Encountered Exception %s" %str(e)

        logger.debug('Read %d lines, successfully parsed %d lines, failed to parse %d lines' % (parsed_logs.count(), access_logs.count(), failed_logs.count()))
        logger.info("DONE. ALL TESTS PASSED")

I see some of the log messages:

    "Starting LogAnalyzer"
    "Parselogs() start"
    "DONE. ALL TESTS PASSED"

But I do not see this one:

    'Read %d lines, successfully parsed %d lines, failed to parse %d lines'

And for this line:

    logger.debug('Read %d lines, successfully parsed %d lines, failed to parse %d lines' % (parsed_logs.count(), access_logs.count(), failed_logs.count()))

I get the following error:

    Encountered Exception Cannot pickle files that are not opened for reading

I do not have a clue as to what is happening. Any help will be appreciated.
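
One possible reading of the error quoted above, not confirmed anywhere in this thread: PySpark has to serialize the function handed to `.map()`, and `self._parseApacheLogLine` is a bound method, so serializing it means serializing `self` as well, including `self._logger` and whatever log file it keeps open for writing. The sketch below reproduces that effect with the standard `pickle` module only, with no Spark involved. `Analyzer`, `parse_line`, and `can_pickle` are hypothetical stand-ins invented for illustration, not the code from the post, and the sketch is written for Python 3.

```python
import os
import pickle
import tempfile


def parse_line(line):
    """Hypothetical stand-in for _parseApacheLogLine: returns (value, ok_flag)."""
    parts = line.split()
    if len(parts) < 2:
        return (line, 0)   # flag 0 marks a line that failed to parse
    return (parts, 1)      # flag 1 marks a successfully parsed line


class Analyzer(object):
    """Hypothetical stand-in for ApacheLogAnalyzer; it owns an open log file."""

    def __init__(self, log_path):
        # A handle opened for writing, similar to what a file-based logger holds.
        self._log_file = open(log_path, "w")

    def parse(self, line):
        # Bound method: serializing it also requires serializing `self`.
        return parse_line(line)

    def close(self):
        self._log_file.close()


def can_pickle(obj):
    """Return True if pickle.dumps(obj) succeeds, False if it raises."""
    try:
        pickle.dumps(obj)
        return True
    except Exception:
        return False


log_path = os.path.join(tempfile.mkdtemp(), "LogAnalyzer.log")
analyzer = Analyzer(log_path)
bound_ok = can_pickle(analyzer.parse)  # drags in `analyzer` and its open file
plain_ok = can_pickle(parse_line)      # module-level function, no hidden state
analyzer.close()

print("bound method picklable:", bound_ok)
print("plain function picklable:", plain_ok)
```

If this is indeed the cause, one option would be to move the parsing logic into a module-level function (or a `@staticmethod`) and pass that to `.map()`, so that nothing holding the logger ends up in the serialized closure.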