Does it run distributed if class not Serializable

2016-09-09 Thread Yusuf Can Gürkan
Hi, if I don't make a class Serializable (... extends Serializable), will it run distributed on the executors, or will it only run on the master machine? Thanks
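A minimal sketch related to this question, assuming Spark's default Java serialization is used to ship closures and the objects they capture to the executors. Under that assumption, capturing an instance of a non-Serializable class in a closure typically makes the job fail with a "Task not serializable" error; it does not quietly fall back to running on the driver. The names below (PlainHelper, SerializableHelper, SerializationCheck, canSerialize) are hypothetical and only illustrate the underlying Java serialization check, without Spark itself:

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// Hypothetical helper that does NOT extend Serializable.
class PlainHelper { def twice(x: Int): Int = x * 2 }

// Same helper, but marked Serializable.
class SerializableHelper extends Serializable { def twice(x: Int): Int = x * 2 }

object SerializationCheck {
  // Tries to Java-serialize the object into an in-memory buffer.
  // Returns false if the object (or something it references) is not Serializable.
  def canSerialize(obj: AnyRef): Boolean =
    try {
      val out = new ObjectOutputStream(new ByteArrayOutputStream())
      out.writeObject(obj)
      out.close()
      true
    } catch {
      case _: NotSerializableException => false
    }

  def main(args: Array[String]): Unit = {
    println(s"PlainHelper serializable:        ${canSerialize(new PlainHelper)}")
    println(s"SerializableHelper serializable: ${canSerialize(new SerializableHelper)}")
  }
}
```

Objects that are created inside the task on the executor, rather than captured from the driver, do not need to be Serializable.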

AWS CLI --jars comma problem

2015-12-03 Thread Yusuf Can Gürkan
Hello, I have a question about the AWS CLI for people who use it. I create a Spark cluster with the AWS CLI and I’m using a Spark step with jar dependencies. But as you can see below, I cannot set multiple jars because the AWS CLI replaces commas with spaces in ARGS. Is there a way of doing it? I can accept

Re: org.apache.spark.sql.AnalysisException with registerTempTable

2015-10-15 Thread Yusuf Can Gürkan
"day(date) day", "hour(date) hour", "concat_ws('-',

org.apache.spark.sql.AnalysisException with registerTempTable

2015-10-15 Thread Yusuf Can Gürkan
Hello, I’m running some Spark SQL queries with the registerTempTable function, but I get the error below: org.apache.spark.sql.AnalysisException: resolved attribute(s) day#1680,year#1678,dt#1682,month#1679,hour#1681 missing from

Re: Java Heap Space Error

2015-09-25 Thread Yusuf Can Gürkan
Hello, it worked like a charm. Thank you very much. Some userids were null, which is why many records went to userid 'null'. When I added a where clause, userid != 'null', it solved the problem. > On 24 Sep 2015, at 22:43, java8964 wrote: > > I can understand why your first query
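The fix reported in this message can be illustrated with a small sketch, assuming the shuffle assigns records to partitions by a hash of the grouping key, so every record with the same userid lands in the same partition. The sample data, the partition count of 8, and the names NullKeySkew, partitionOf and partitionSizes are hypothetical and not taken from the thread; the code is plain Scala and does not use Spark:

```scala
object NullKeySkew {
  // Mimics hash partitioning: a given key always maps to the same partition index.
  def partitionOf(key: String, numPartitions: Int): Int = {
    val mod = key.hashCode % numPartitions
    if (mod < 0) mod + numPartitions else mod
  }

  // Number of records that would land in each partition.
  def partitionSizes(keys: Seq[String], numPartitions: Int): Map[Int, Int] =
    keys
      .groupBy(k => partitionOf(k, numPartitions))
      .map { case (partition, ks) => partition -> ks.size }

  def main(args: Array[String]): Unit = {
    val numPartitions = 8
    // Hypothetical data: 1000 distinct users plus 5000 records whose userid is "null".
    val realUsers = (1 to 1000).map(i => s"user$i")
    val nullUsers = Seq.fill(5000)("null")
    val all = realUsers ++ nullUsers

    val before = partitionSizes(all, numPartitions)
    val after = partitionSizes(all.filter(_ != "null"), numPartitions)

    println(s"largest partition before filter: ${before.values.max}")
    println(s"largest partition after filter:  ${after.values.max}")
  }
}
```

All records sharing the 'null' key pile up in one partition, which is consistent with a single task running out of heap; filtering them out before the aggregation removes that hot key.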

Convert Vector to RDD[Double]

2015-09-25 Thread Yusuf Can Gürkan
How can I convert a Vector to an RDD[Double]? For example: val vector = Vectors.dense(1.0,2.0) val rdd // I need sc.parallelize(Array(1.0,2.0))

Re: Java Heap Space Error

2015-09-24 Thread Yusuf Can Gürkan
@Jingyu Yes, it works without regex and concatenation, as in the query below. So what can we conclude from this? When I do it that way, shuffle read sizes are equally distributed between partitions. val usersInputDF = sqlContext.sql( s""" | select userid from landing where

Re: Java Heap Space Error

2015-09-24 Thread Yusuf Can Gürkan
Yes, right, the query you wrote worked on the same cluster. In that case the partitions were equally distributed, but when I used regex and concatenation they were not, as I said before. The query with concatenation is below: val usersInputDF = sqlContext.sql( s""" | select userid,concat_ws('

Re: Java Heap Space Error

2015-09-24 Thread Yusuf Can Gürkan
Thank you very much. This makes sense. I will write back after trying your solution. > On 24 Sep 2015, at 22:43, java8964 wrote: > > I can understand why your first query will finish without OOM, but the new > one will fail with OOM. > > In the new query, you are asking a

Re: Java Heap Space Error

2015-09-23 Thread Yusuf Can Gürkan
Yes, it’s possible. I use S3 as the data source. My external tables are partitioned. The task below is 193/200. The job has 2 stages, and this is the 193rd of 200 tasks in the 2nd stage because of sql.shuffle.partitions. How can I avoid this situation? This is my query: select userid,concat_ws('

Heap Space Error

2015-09-22 Thread Yusuf Can Gürkan
I run the code below and get an error: val dateUtil = new DateUtil() val usersInputDF = sqlContext.sql( s""" | select userid,concat_ws(' ',collect_list(concat_ws(' ',if(productname is not

Re: SQLContext Create Table Problem

2015-08-19 Thread Yusuf Can Gürkan
? Your query is trying to create a table and persist the metadata of the table in the metastore, which is only supported by HiveContext. On Wed, Aug 19, 2015 at 8:44 AM, Yusuf Can Gürkan yu...@useinsider.com wrote: Hello, I’m trying to create a table