Re: spark.sql.autoBroadcastJoinThreshold not taking effect

2019-05-13 Thread Lantao Jin
Maybe you could try "--conf spark.sql.statistics.fallBackToHdfs=true"
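
As far as I understand it, the 8 GB in the error is a hard cap on the built broadcast relation inside BroadcastExchangeExec, separate from spark.sql.autoBroadcastJoinThreshold; when table statistics are missing, the planner can badly underestimate a table's size and still pick a broadcast join. Setting spark.sql.statistics.fallBackToHdfs=true makes Spark estimate the size from the files on HDFS instead. A rough, untested sketch of setting both (e.g. in spark-shell; the app name here is made up):

// Untested sketch, not taken from the original job.
// The same two settings can also be passed on spark-submit:
//   --conf spark.sql.statistics.fallBackToHdfs=true
//   --conf spark.sql.autoBroadcastJoinThreshold=1073741824

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("broadcast-threshold-example")
  // Estimate table sizes from the files on HDFS when catalog statistics
  // are missing, so the broadcast decision is based on a realistic size.
  .config("spark.sql.statistics.fallBackToHdfs", "true")
  // 1 GB planner threshold, expressed in bytes.
  .config("spark.sql.autoBroadcastJoinThreshold", 1024L * 1024 * 1024)
  .getOrCreate()

// Confirm what is actually in effect for this session.
println(spark.conf.get("spark.sql.statistics.fallBackToHdfs"))
println(spark.conf.get("spark.sql.autoBroadcastJoinThreshold"))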

On 2019/05/11 01:54:27, V0lleyBallJunki3 wrote:
> Hello,
>    I have set spark.sql.autoBroadcastJoinThreshold=1GB and I am running the
> Spark job. However, my application is failing with:
>
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:678)
> Caused by: org.apache.spark.SparkException: Cannot broadcast the table that is larger than 8GB: 8 GB
>     at org.apache.spark.sql.execution.exchange.BroadcastExchangeExec$$anonfun$relationFuture$1$$anonfun$apply$1.apply(BroadcastExchangeExec.scala:103)
>     at org.apache.spark.sql.execution.exchange.BroadcastExchangeExec$$anonfun$relationFuture$1$$anonfun$apply$1.apply(BroadcastExchangeExec.scala:76)
>     at org.apache.spark.sql.execution.SQLExecution$$anonfun$withExecutionId$1.apply(SQLExecution.scala:101)
>     at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:125)
>     at org.apache.spark.sql.execution.SQLExecution$.withExecutionId(SQLExecution.scala:98)
>     at org.apache.spark.sql.execution.exchange.BroadcastExchangeExec$$anonfun$relationFuture$1.apply(BroadcastExchangeExec.scala:75)
>     at org.apache.spark.sql.execution.exchange.BroadcastExchangeExec$$anonfun$relationFuture$1.apply(BroadcastExchangeExec.scala:75)
>     at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
>     at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
>
> When I am running with a limit of 1 GB, how can I hit the 8 GB limit? I made
> sure in the Spark History Server as well, by printing out the value of
> spark.sql.autoBroadcastJoinThreshold, that the value is correctly set, and the
> explain plan also shows that it is trying to do a broadcast join. Any ideas?



spark.sql.autoBroadcastJoinThreshold not taking effect

2019-05-10 Thread V0lleyBallJunki3
Hello,
   I have set spark.sql.autoBroadcastJoinThreshold=1GB and I am running the
Spark job. However, my application is failing with:

    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:678)
Caused by: org.apache.spark.SparkException: Cannot broadcast the table that is larger than 8GB: 8 GB
    at org.apache.spark.sql.execution.exchange.BroadcastExchangeExec$$anonfun$relationFuture$1$$anonfun$apply$1.apply(BroadcastExchangeExec.scala:103)
    at org.apache.spark.sql.execution.exchange.BroadcastExchangeExec$$anonfun$relationFuture$1$$anonfun$apply$1.apply(BroadcastExchangeExec.scala:76)
    at org.apache.spark.sql.execution.SQLExecution$$anonfun$withExecutionId$1.apply(SQLExecution.scala:101)
    at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:125)
    at org.apache.spark.sql.execution.SQLExecution$.withExecutionId(SQLExecution.scala:98)
    at org.apache.spark.sql.execution.exchange.BroadcastExchangeExec$$anonfun$relationFuture$1.apply(BroadcastExchangeExec.scala:75)
    at org.apache.spark.sql.execution.exchange.BroadcastExchangeExec$$anonfun$relationFuture$1.apply(BroadcastExchangeExec.scala:75)
    at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
    at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

When I am running with a limit of 1 GB, how can I hit the 8 GB limit? I made
sure in the Spark History Server as well, by printing out the value of
spark.sql.autoBroadcastJoinThreshold, that the value is correctly set, and the
explain plan also shows that it is trying to do a broadcast join. Any ideas?
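
For reference, a minimal, untested sketch of the kind of check mentioned above (the tables here are made-up stand-ins, not the real inputs):

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("broadcast-check").getOrCreate()

// 1 GB expressed in bytes; this is only the planner threshold. The 8 GB in
// the error is a separate, hard cap on the built broadcast relation.
spark.conf.set("spark.sql.autoBroadcastJoinThreshold", 1024L * 1024 * 1024)
println(spark.conf.get("spark.sql.autoBroadcastJoinThreshold"))

// Stand-in tables; the real job joins much larger inputs.
val big = spark.range(0L, 100000000L).toDF("id")
val small = spark.range(0L, 1000L).toDF("id")

// If the physical plan shows BroadcastHashJoin, the *estimated* size of the
// smaller side was under the threshold; the 8 GB error only shows up later,
// when the broadcast relation is actually materialized.
big.join(small, "id").explain()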


