Thanks first. It's a Scala object extending App.
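If it helps anyone else hitting this: scala.App is built on DelayedInit, so vals written directly in the object body are only assigned when main() actually runs; on a worker JVM, where main() never runs, they keep their JVM defaults (null and 0). Below is a rough, untested sketch of one workaround — using an explicit main method instead of extending App. It assumes the 0.7.x spark.SparkContext constructor, and the object name, master URL, paths, and the pair emitted by the map are placeholders (the real transformation body is elided above):

import spark.SparkContext   // Spark 0.7.x package name; later releases use org.apache.spark

// Hypothetical rewrite: a plain main method instead of `extends App`.
object FrequencyJob {
  def main(args: Array[String]) {
    val TAB = "\t"       // now a local, captured by value in the closures below
    val support = 2

    // placeholder master URL and paths -- adjust to the actual cluster
    val sc = new SparkContext("spark://master:7077", "FrequencyJob")
    val raw = sc.textFile("hdfs:///path/to/input")

    val filtered = raw.map { line =>
      val lineSplit = line.split(TAB)
      (lineSplit(0), lineSplit.length)   // placeholder pair; the original body is elided
    }.filter(p => p._2 >= support)

    filtered.saveAsTextFile("hdfs:///path/to/output")
  }
}

Declaring TAB and support as locals inside main means they are serialized together with the closure, instead of being looked up on the never-initialized singleton on the worker side.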
2013/8/7 Ian O'Connell <[email protected]>

> Is your code perhaps part of an object? The closure cleaner doesn't
> attempt to pull in parts of objects.
>
> What does the code around your sketched section look like?
>
>
> On Wed, Aug 7, 2013 at 6:47 AM, Han JU <[email protected]> wrote:
>
>> Hi,
>>
>> I'm just getting my hands on Spark and I wrote a simple job in Scala.
>> It is sketched like this:
>>
>> val TAB = "\t"
>> val support = 2
>>
>> val sc = new SparkContext(...)
>> val raw = sc.textFile(...)
>>
>> val filtered = raw.map(line => {
>>   val lineSplit = line.split(TAB)  // TAB is null and an exception is thrown during the run
>>   ...
>> }).filter(p => p._2 >= support)    // support here is 0 during the run
>>
>> ...
>>
>> I run the sbt-assembly jar with "java -cp ..." on a standalone cluster,
>> and I found that when referenced inside the RDD transformations, the two
>> values TAB and support are set to their default values: TAB is null and
>> support is 0, no longer "\t" and 2 as initialized above.
>>
>> If the same jar is run locally (MASTER set to local or local[k] instead of
>> spark://...) on the same input, it runs perfectly. The code also runs fine
>> in spark-shell on the cluster.
>>
>> For the jar to run correctly on the cluster, I have to hard-code the string
>> literal and the number in the RDD transformation part.
>>
>> It really looks like a weird bug to me; maybe it has something to do with
>> the sbt-assembly jar compilation? Any suggestions?
>>
>> Thanks.
>>
>> I'm using Spark version 0.7.3 and Scala 2.9.3.
>>
>> --
>> JU Han
>>
>> Software Engineer Intern @ KXEN Inc.
>> UTC - Université de Technologie de Compiègne
>> GI06 - Fouille de Données et Décisionnel
>>
>> +33 0619608888

--
JU Han

Software Engineer Intern @ KXEN Inc.
UTC - Université de Technologie de Compiègne
GI06 - Fouille de Données et Décisionnel

+33 0619608888
