Could this be related to https://issues.apache.org/jira/browse/SPARK-17733 ?




------------------ Original ------------------
From: "Cheng Lian-3 [via Apache Spark Developers List]" <ml-node+s1001551n2105...@n3.nabble.com>
Send time: Thursday, Feb 23, 2017 9:43 AM
To: "Stan Zhai"<m...@zhaishidan.cn>; 

Subject:  Re: The driver hangs at DataFrame.rdd in Spark 2.1.0



                           
Just from the thread dump you provided, it seems that this particular
query plan jams our optimizer. However, it's also possible that the
driver just happened to be running optimizer rules at that particular
point in time.
     
     
Since query planning doesn't touch any actual data, could you please try
to minimize this query by replacing the actual relations with temporary
views derived from Scala local collections? That would make it much
easier for others to reproduce the issue.
     
Cheng
     
     
     On 2/22/17 5:16 PM, Stan Zhai wrote:
     
       Thanks for Lian's reply.
       
       
       Here is the query plan generated by Spark 1.6.2 (I can't get it in
       Spark 2.1.0):
                ...       
        
                
         
         ------------------ Original ------------------
         Subject: Re: The driver hangs at DataFrame.rdd in Spark 2.1.0
         
         
         
         
What is the query plan? We once observed query plans that grow
exponentially in iterative ML workloads, causing the query planner to hang
forever. For example, each iteration combines 4 plan trees from the
previous iteration into a larger plan tree. The size of the plan
tree can easily reach billions of nodes after 15 iterations.
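
As a rough illustration of the growth described above, the sketch below counts plan nodes under a simplified model. It assumes a single-node starting plan and one new combining node per iteration placed over four copies of the previous plan. The function name and the model are hypothetical and are not Spark code.

```python
def plan_size_after(iterations, fan_in=4, base_size=1):
    """Node count of a plan tree under a simplified growth model.

    Assumption (not taken from Spark): every iteration creates one new
    combining node whose children are `fan_in` copies of the previous plan.
    """
    size = base_size
    for _ in range(iterations):
        # fan_in copies of the previous tree plus one new root node
        size = fan_in * size + 1
    return size


if __name__ == "__main__":
    for n in (5, 10, 15):
        print(n, plan_size_after(n))
```

Under these assumptions the count passes one billion nodes by iteration 15, which matches the order of magnitude mentioned above.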
         
         
         On 2/22/17 9:29 AM, Stan Zhai wrote:
         
                    Hi all,
           
           
           The driver hangs at DataFrame.rdd in Spark 2.1.0 when the
           DataFrame (SQL) is complex. The following is a thread dump of my driver:
           ...
                  
       
          
                                
        
        



--
View this message in context: 
http://apache-spark-developers-list.1001551.n3.nabble.com/Re-The-driver-hangs-at-DataFrame-rdd-in-Spark-2-1-0-tp21052p21054.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com.
