[jira] [Commented] (SPARK-12954) pyspark API 1.3.0 how we can patitionning by columns

2016-01-21 Thread malouke (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-12954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15110596#comment-15110596
 ] 

malouke commented on SPARK-12954:
-

ok sorry,

> pyspark API 1.3.0  how we can patitionning by columns  
> ---
>
> Key: SPARK-12954
> URL: https://issues.apache.org/jira/browse/SPARK-12954
> Project: Spark
> Issue Type: Bug
> Components: PySpark
> Affects Versions: 1.3.0
> Environment: Spark 1.3.0
> Cloudera Manager
> Linux platform
> PySpark
> Reporter: malouke
> Priority: Blocker
> Labels: documentation, features, performance, test
>
> Hi,
> Before posting this question I tried a lot of things, but I did not find a solution.
> I have 9 tables and I join them in two ways:
> 1. df.join(df2, df.id == df2.id2, 'left_outer')
> 2. sqlcontext.sql("select * from t1 left join t2 on id_t1 = id_t2")
> After that I want to partition the result of the join by date:
> - In PySpark 1.5.2 I tried partitionBy. When the result comes from a join of
> at most two tables, everything is OK. But when I join more than three tables,
> I get no result after several hours.
> - In PySpark 1.3.0 I did not find a function in the API that lets me
> partition by date columns.
> Q: Can someone help me resolve this problem?
> Thank you in advance.
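
For readers following the thread, here is a minimal sketch of what "partition the join result by date" means on disk. It is plain Python with no Spark dependency, so it only illustrates the idea and is not the Spark implementation. The function names (left_outer_join, write_partitioned), the sample rows, and the part-00000.csv file name are all hypothetical. In PySpark 1.4 and later the corresponding call is df.write.partitionBy("date") followed by a format method such as parquet(path); that writer API is not available in 1.3.0.

```python
import csv
import os
import tempfile
from collections import defaultdict


def left_outer_join(left, right, left_key, right_key):
    """Left outer join of two lists of dicts; unmatched left rows are kept."""
    index = defaultdict(list)
    for row in right:
        index[row[right_key]].append(row)
    joined = []
    for row in left:
        matches = index.get(row[left_key])
        if matches:
            for match in matches:
                merged = dict(row)
                merged.update(match)
                joined.append(merged)
        else:
            # No match on the right side: keep the left row unchanged.
            joined.append(dict(row))
    return joined


def write_partitioned(rows, partition_col, base_dir):
    """Write rows into one sub-directory per distinct value of partition_col.

    Directories are named "<column>=<value>", the Hive-style layout that
    Spark's partitioned output also uses. Returns the written file paths.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row[partition_col]].append(row)
    # The partition column lives in the directory name, not in the files.
    columns = sorted({k for r in rows for k in r if k != partition_col})
    paths = []
    for value, group in sorted(groups.items()):
        part_dir = os.path.join(base_dir, "%s=%s" % (partition_col, value))
        os.makedirs(part_dir, exist_ok=True)
        path = os.path.join(part_dir, "part-00000.csv")
        with open(path, "w", newline="") as handle:
            writer = csv.DictWriter(
                handle, fieldnames=columns, restval="", extrasaction="ignore"
            )
            writer.writeheader()
            writer.writerows(group)
        paths.append(path)
    return paths


# Hypothetical sample data standing in for tables t1 and t2.
t1 = [
    {"id_t1": 1, "date": "2016-01-20"},
    {"id_t1": 2, "date": "2016-01-21"},
]
t2 = [{"id_t2": 1, "value": "a"}]

joined = left_outer_join(t1, t2, "id_t1", "id_t2")
output_dir = tempfile.mkdtemp()
written = write_partitioned(joined, "date", output_dir)
```

Each distinct date ends up in its own directory, so a reader that only needs one day can skip the others. The sketch holds everything in memory, which says nothing about why the multi-table join in the report runs for hours.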



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-12954) pyspark API 1.3.0 how we can patitionning by columns

2016-01-21 Thread malouke (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-12954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15110603#comment-15110603
 ] 

malouke commented on SPARK-12954:
-

Hi Sean,
Where can I ask this question?



