Introducing Spark User Group in Korea & Question on creating non-software goods (stickers)

2016-04-01 Thread Kevin (Sangwoo) Kim
Hi all! I'm Kevin, one of the contributors to Spark, and I'm organizing the Spark User Group in Korea. We have 2,500 members in the community, and it's growing even faster today. https://www.facebook.com/groups/sparkkoreauser/

Re: ClassNotFoundException

2015-03-16 Thread Kevin (Sangwoo) Kim
Hi Ralph, this looks like the https://issues.apache.org/jira/browse/SPARK-6299 issue, which I'm working on. I've submitted a PR for it; would you test it? Regards, Kevin On Tue, Mar 17, 2015 at 1:11 AM Ralph Bergmann ra...@dasralph.de wrote: Hi, I want to try the JavaSparkPi example[1] on a

Re: Use of nscala-time within spark-shell

2015-02-17 Thread Kevin (Sangwoo) Kim
Great, or you can just use nscala-time built for Scala 2.10! On Tue Feb 17 2015 at 5:41:53 PM Hammam CHAMSI hscha...@hotmail.com wrote: Thanks Kevin for your reply. I downloaded the pre-built version and, as you said, the default Spark Scala version is 2.10. I'm now building Spark 1.2.1 with Scala
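For example, a minimal sketch assuming the 2.10 jar is already on the spark-shell classpath:

import com.github.nscala_time.time.Imports._

val now = DateTime.now        // joda-time DateTime via nscala-time
val weekAgo = now - 7.days    // nscala-time's duration syntax
println(weekAgo < now)        // prints true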

Re: Use of nscala-time within spark-shell

2015-02-17 Thread Kevin (Sangwoo) Kim
Then why don't you use nscala-time_2.10-1.8.0.jar instead of nscala-time_2.11-1.8.0.jar? On Tue Feb 17 2015 at 5:55:50 PM Hammam CHAMSI hscha...@hotmail.com wrote: I can use nscala-time with Scala, but my issue is that I can't use it within the spark-shell console! It gives me the error below.
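In case it helps, a sketch of how that jar could be put on the shell's classpath (the path is an assumption):

bin/spark-shell --jars /path/to/nscala-time_2.10-1.8.0.jar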

Re: Use of nscala-time within spark-shell

2015-02-16 Thread Kevin (Sangwoo) Kim
Which Scala version was used to build your Spark? It seems your nscala-time library targets Scala 2.11, while the default Spark Scala version is 2.10. On Tue Feb 17 2015 at 1:51:47 AM Hammam CHAMSI hscha...@hotmail.com wrote: Hi All, thanks in advance for your help. I have a timestamp which I need
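A quick way to check, from inside spark-shell (the output shown is illustrative and will vary with your build):

scala> util.Properties.versionString
res0: String = version 2.10.4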

Re: Can spark job server be used to visualize streaming data?

2015-02-13 Thread Kevin (Sangwoo) Kim
it is not yet supported for CDH 5.3 and Spark 1.2. Please correct me if I am mistaken. On Thu, Feb 12, 2015 at 7:33 PM, Kevin (Sangwoo) Kim kevin...@apache.org wrote: Apache Zeppelin also has a scheduler, so you can reload your chart periodically. Check it out: http

Re: Can spark job server be used to visualize streaming data?

2015-02-12 Thread Kevin (Sangwoo) Kim
Apache Zeppelin also has a scheduler, so you can reload your chart periodically. Check it out: http://zeppelin.incubator.apache.org/docs/tutorial/tutorial.html On Fri Feb 13 2015 at 7:29:00 AM Silvio Fiorito silvio.fior...@granturing.com wrote: One method I’ve used is to publish each

Re: How to compute RDD[(String, Set[String])] that include large Set

2015-01-20 Thread Kevin (Sangwoo) Kim
Great to hear you got a solution! Cheers! Kevin On Wed Jan 21 2015 at 11:13:44 AM jagaximo takuya_seg...@dwango.co.jp wrote: Kevin (Sangwoo) Kim wrote: If keys are not too many, you can do it like this: val data = List( ("A", Set(1,2,3)), ("A", Set(1,2,4)), ("B", Set(1,2,3)) ) val

Re: How to compute RDD[(String, Set[String])] that include large Set

2015-01-19 Thread Kevin (Sangwoo) Kim
In your code, you're combining large sets, like (set1 ++ set2).size, which is not a good idea. (rdd1 ++ rdd2).distinct is an equivalent implementation and will compute in a distributed manner. I'm not very sure your computation on keyed sets is feasible to transform into RDDs. Regards,
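As a rough sketch of what that could look like (unionSize, rdd1/rdd2, and the String element type are assumptions, not from the thread):

import org.apache.spark.rdd.RDD

// Instead of materializing (set1 ++ set2).size on the driver,
// keep the elements in RDDs and let Spark deduplicate:
def unionSize(rdd1: RDD[String], rdd2: RDD[String]): Long =
  (rdd1 ++ rdd2).distinct().count()  // ++ is an alias for union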

Re: How to compute RDD[(String, Set[String])] that include large Set

2015-01-19 Thread Kevin (Sangwoo) Kim
If keys are not too many, you can do it like this:

val data = List(
  ("A", Set(1,2,3)),
  ("A", Set(1,2,4)),
  ("B", Set(1,2,3))
)
val rdd = sc.parallelize(data)
rdd.persist()
rdd.filter(_._1 == "A").flatMap(_._2).distinct.count
rdd.filter(_._1 == "B").flatMap(_._2).distinct.count
rdd.unpersist()
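If there are many keys, one keyed variant (a sketch, not from the thread) is to flatten each set into (key, element) pairs and deduplicate before counting:

// Flatten to (key, element) pairs, drop duplicates, count per key.
// countByKey() collects the counts to the driver, so this assumes
// the number of distinct keys is small enough to fit there.
val perKeyDistinct = rdd
  .flatMap { case (k, vs) => vs.map(v => (k, v)) }
  .distinct()
  .countByKey()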

ExceptionInInitializerError when using a class defined in REPL

2015-01-18 Thread Kevin (Sangwoo) Kim
Hi experts, I'm getting ExceptionInInitializerError when using a class defined in the REPL. The code is something like this:

case class TEST(a: String)
sc.textFile(~~~).map(TEST(_)).count

The code above used to work well until yesterday, but suddenly, for some reason, it fails with the error.

Re: Futures timed out during unpersist

2015-01-17 Thread Kevin (Sangwoo) Kim
(Sangwoo) Kim kevin...@apache.org wrote: Hi experts, I got an error while unpersisting an RDD. Any ideas?

java.util.concurrent.TimeoutException: Futures timed out after [30 seconds]
  at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
  at scala.concurrent.impl.Promise

Futures timed out during unpersist

2015-01-16 Thread Kevin (Sangwoo) Kim
Hi experts, I got an error while unpersisting an RDD. Any ideas?

java.util.concurrent.TimeoutException: Futures timed out after [30 seconds]
  at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
  at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
  at
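One workaround worth trying (a sketch; not confirmed in this thread) is a non-blocking unpersist, so the caller does not await the future that times out:

rdd.unpersist(blocking = false)  // RDD.unpersist(blocking: Boolean = true) in Spark 1.x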