Hi,
If I don't make a class Serializable (... extends Serializable), will it still
run distributed on the executors, or will it only run on the master machine?
Thanks
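A minimal sketch of what is at stake, assuming Spark 1.x, local mode, and a hypothetical `Multiplier` helper (not from the thread): a class whose instance is captured by a closure must be Serializable so tasks can be shipped to executors; without it the job fails with a `NotSerializableException` rather than silently running only on the driver.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical helper: an instance is captured by the map closure below,
// so it is serialized and shipped to the executors. Removing "extends
// Serializable" makes the job fail with a NotSerializableException; it
// does not fall back to running only on the master.
class Multiplier(factor: Int) extends Serializable {
  def apply(x: Int): Int = x * factor
}

object SerializableDemo {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("serializable-demo").setMaster("local[2]"))
    val times3 = new Multiplier(3)
    val out = sc.parallelize(1 to 4).map(times3(_)).collect()
    println(out.mkString(","))
    sc.stop()
  }
}
```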
Hello
I have a question about the AWS CLI for people who use it.
I create a Spark cluster with the AWS CLI and I'm using a Spark step with jar
dependencies. But as you can see below, I cannot set multiple jars, because the
AWS CLI replaces commas with spaces in ARGS.
Is there a way of doing it? I can accept
"day(date) day",
"hour(date) hour",
"concat_ws('-',
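One workaround (a sketch, with a hypothetical cluster id, bucket, and jar names): pass the step definition as JSON via the CLI's `file://` input instead of the shorthand `Args=[...]` syntax, so commas inside a single argument are not treated as argument separators.

```shell
# Hypothetical ids and S3 paths; the point is the file:// JSON form,
# where Args is a JSON array and the comma inside "--jars" survives.
cat > step.json <<'EOF'
[
  {
    "Type": "CUSTOM_JAR",
    "Name": "spark-step",
    "ActionOnFailure": "CONTINUE",
    "Jar": "command-runner.jar",
    "Args": ["spark-submit",
             "--jars", "s3://mybucket/dep1.jar,s3://mybucket/dep2.jar",
             "s3://mybucket/app.jar"]
  }
]
EOF
aws emr add-steps --cluster-id j-XXXXXXXX --steps file://step.json
```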
Hello,
I’m running some Spark SQL queries with the registerTempTable function. But I
get the error below:
org.apache.spark.sql.AnalysisException: resolved attribute(s)
day#1680,year#1678,dt#1682,month#1679,hour#1681 missing from
Hello,
It worked like a charm. Thank you very much.
Some userids were null; that’s why many records went to userid ‘null’. When I
put a where clause, userid != ‘null’, it solved the problem.
> On 24 Sep 2015, at 22:43, java8964 wrote:
>
> I can understand why your first query
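For the record, a sketch of the filter (assuming Spark 1.x and the `landing` temp table used elsewhere in the thread): SQL NULLs and rows holding the literal string 'null' are filtered by different predicates, and the fix above covers the string case.

```scala
// Sketch, assuming an SQLContext with a registered temp table "landing".
// IS NOT NULL drops real SQL NULLs; the inequality drops rows whose
// userid column holds the literal string "null".
val usersInputDF = sqlContext.sql(
  """select userid from landing
    |where userid is not null and userid != 'null'""".stripMargin)
```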
How can I convert a Vector to RDD[Double]? For example:
val vector = Vectors.dense(1.0, 2.0)
val rdd // I need sc.parallelize(Array(1.0, 2.0))
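A short sketch of one way to do this, assuming Spark 1.x MLlib: a local `Vector` exposes its values through `toArray`, which can then be parallelized.

```scala
import org.apache.spark.mllib.linalg.Vectors

// A dense local vector holds its values on the driver; toArray yields
// Array(1.0, 2.0), which sc.parallelize turns into an RDD[Double].
val vector = Vectors.dense(1.0, 2.0)
val rdd = sc.parallelize(vector.toArray)
```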
@Jingyu
Yes, it works without regex and concatenation, as in the query below.
So, what can we understand from this? Because when I do it like that, the
shuffle read sizes are equally distributed between partitions.
val usersInputDF = sqlContext.sql(
s"""
| select userid from landing where
Yes, right, the query you wrote worked in the same cluster. In this case,
partitions were equally distributed, but when I used regex and concatenations
it was not, as I said before. The query with concatenation is below:
val usersInputDF = sqlContext.sql(
s"""
| select userid,concat_ws('
Thank you very much. This makes sense. I will write back after trying your solution.
> On 24 Sep 2015, at 22:43, java8964 wrote:
>
> I can understand why your first query will finish without OOM, but the new
> one will fail with OOM.
>
> In the new query, you are asking a
Yes, it’s possible. I use S3 as the data source. My external tables are
partitioned. The task below is 193/200. The job has 2 stages, and it is the
193rd task of 200 in the 2nd stage because of sql.shuffle.partitions.
How can I avoid this situation? This is my query:
select userid,concat_ws('
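A sketch of one knob to try, assuming Spark 1.x (the value 400 is only illustrative): the 200 tasks come from the default of `spark.sql.shuffle.partitions`, and raising it spreads the shuffled data over more, smaller tasks.

```scala
// Raising spark.sql.shuffle.partitions does not fix skew caused by a
// single hot key, but it lowers the chance that several heavy keys
// land in the same shuffle partition.
sqlContext.setConf("spark.sql.shuffle.partitions", "400")
```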
I run the code below and get an error:
val dateUtil = new DateUtil()
val usersInputDF = sqlContext.sql(
s"""
| select userid,concat_ws(' ',collect_list(concat_ws(' ',if(productname
is not
Your query is trying to
create a table and persist the metadata of the table in the metastore, which is
only supported by HiveContext.
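A minimal sketch of the fix, assuming Spark 1.x with Hive support built in (the table definition is hypothetical): construct a `HiveContext`, which is a drop-in superset of `SQLContext`, so that CREATE TABLE can persist metadata to the metastore.

```scala
import org.apache.spark.sql.hive.HiveContext

// HiveContext supports DDL that writes to the Hive metastore; a plain
// SQLContext does not, which is why the CREATE TABLE above fails.
val sqlContext = new HiveContext(sc)
sqlContext.sql("CREATE TABLE IF NOT EXISTS users (userid STRING)")
</imports>
```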
On Wed, Aug 19, 2015 at 8:44 AM, Yusuf Can Gürkan yu...@useinsider.com wrote:
Hello,
I’m trying to create a table