[jira] [Created] (SPARK-12954) pyspark API 1.3.0 how we can partition by columns

malouke (JIRA) Thu, 21 Jan 2016 04:32:24 -0800

malouke created SPARK-12954:
-------------------------------

             Summary: PySpark API 1.3.0: how can we partition by columns?
                 Key: SPARK-12954
                 URL: https://issues.apache.org/jira/browse/SPARK-12954
             Project: Spark
          Issue Type: Bug
          Components: PySpark
    Affects Versions: 1.3.0
         Environment: Spark 1.3.0
Cloudera Manager
Linux platform
PySpark
            Reporter: malouke
            Priority: Blocker



Hi,
Before posting this question I tried a lot of things, but I did not find a
solution.

I have 9 tables and I join them in two ways:
 1. df.join(df2, df.id == df2.id2, 'left_outer')
 2. sqlcontext.sql("select * from t1 left join t2 on id_t1=id_t2")

After that I want to partition the result of the join by date:
- In PySpark 1.5.2 I tried partitionBy. When the table is the result of joining
  at most two tables, everything works. But when I join more than three tables,
  I get no result even after several hours.
- In PySpark 1.3.0 I cannot find any function in the API that lets me partition
  by date columns.
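
To make the workflow concrete, here is a minimal sketch of what I am trying to
do. It assumes Spark 1.4 or later, where DataFrame.write returns a
DataFrameWriter that has partitionBy, so it does not run as-is on 1.3.0. The
helper names left_join_all and write_partitioned, the output path, and the
column name "date" are hypothetical and only for illustration.

```python
from functools import reduce


def left_join_all(first, others):
    """Left-outer join `first` with each (DataFrame, condition) pair, in order.

    `others` is a list of (DataFrame, join condition) tuples, for example
    [(t2, t1.id_t1 == t2.id_t2), (t3, t1.id_t1 == t3.id_t3)].
    """
    return reduce(
        lambda acc, pair: acc.join(pair[0], pair[1], "left_outer"),
        others,
        first,
    )


def write_partitioned(df, path, *columns):
    """Write `df` as Parquet, partitioned on disk by the given columns.

    Assumption: Spark 1.4+ (DataFrame.write is not available in 1.3.0).
    """
    df.write.partitionBy(*columns).parquet(path)


# Hypothetical usage with DataFrames t1, t2, t3 that are already loaded:
# joined = left_join_all(t1, [(t2, t1.id_t1 == t2.id_t2),
#                             (t3, t1.id_t1 == t3.id_t3)])
# write_partitioned(joined, "/tmp/joined_by_date", "date")
```

Chaining the joins this way is equivalent to calling join repeatedly by hand;
it does not by itself explain or fix the long run time with more than three
tables.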


Q: Can someone help me resolve this problem?
Thank you in advance.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
