After doing group, you can use mkstring on the data frame. Following is an 
example where are columns are concatenated with space as a separator.



scala> call_cdf.map(row => row.mkString(" ")).show(false)
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|value                                                                          
                                                                                
                                                                                
                                                                                
                                                      |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|1 AAAAAAAABAAAAAAA 1998-01-01 null null 2450997 NY Metro large 2325 1374075 
8AM-12AM Keith Cunningham 4 Matters may hear as; profita New, cold plants can 
put al Dante Cook 3 pri 4 ese 995 Park 3rd Dr. Suite 470 Five Points Ziebach 
County SD 56098 United States -6.0 0.02                                         
                                                              |
|2 AAAAAAAACAAAAAAA 1998-01-01 2000-12-31 null 2450876 Mid Atlantic large 4208 
837392 8AM-4PM Stephen Clem 3 Classes devote largely other, standard ter Free 
germans prove flatly industrial drugs. Low questions come to a equations. 
British, conservative Christopher Perez 6 cally 3 pri 245 Johnson  Circle Suite 
200 Fairview Williamson County TN 35709 United States -5.0 0.03|
|3 AAAAAAAACAAAAAAA 2001-01-01 null null 2450876 Mid Atlantic small 3251 837392 
8AM-4PM William Johnson 3 Classes devote largely other, standard ter Ridiculous 
requirements must not implement about pure values. Substances know powers. 
Political rel Derrick Burke 6 cally 3 pri 245 Johnson  Circle Suite 200 
Fairview Williamson County TN 35709 United States -5.0 0.03        |
|4 AAAAAAAAEAAAAAAA 1998-01-01 2000-01-01 null 2450872 North Midwest large 2596 
708708 8AM-4PM Lamont Greene 3 Events must find anyway  Great rates must ensure 
famous, other banks. As main goals get home as a  Marvin Dean 2 able 2 able 927 
Oak Main ST Suite 150 Five Points Williamson County TN 36098 United States -5.0 
0.03                                                  |
|5 AAAAAAAAEAAAAAAA 2000-01-02 2001-12-31 null 2450872 North Midwest medium 
2596 708708 8AM-12AM Lamont Greene 3 Events must find anyway  So fresh supplies 
keep meanwhile religious, labour years. Rapid, careful subject Matthew Williams 
2 able 1 able 927 Oak Main ST Suite 150 Five Points Williamson County TN 36098 
United States -5.0 0.0                                     |
|6 AAAAAAAAEAAAAAAA 2002-01-01 null null 2450872 North Midwest small 2596 
708708 8AM-4PM Emilio Romano 6 As well novel sentences check through the plans. 
Sophisticated cities fall for e William Johnson 5 anti 1 able 927 Oak Main ST 
Suite 150 Five Points Williamson County TN 36098 United States -5.0 0.07        
                                                              |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+


+-----------------+-----------------+-----------------+---------------+-----------------+---------------+-------------+--------+------------+--------+--------+----------------+---------+--------------------+--------------------+-----------------+-----------+----------------+----------+---------------+----------------+--------------+--------------+---------------+-----------+-----------------+--------+------+-------------+-------------+-----------------+
|cc_call_center_sk|cc_call_center_id|cc_rec_start_date|cc_rec_end_date|cc_closed_date_sk|cc_open_date_sk|
      cc_name|cc_class|cc_employees|cc_sq_ft|cc_hours|      
cc_manager|cc_mkt_id|        cc_mkt_class|         
cc_mkt_desc|cc_market_manager|cc_division|cc_division_name|cc_company|cc_company_name|cc_street_number|cc_street_name|cc_street_type|cc_suite_number|
    cc_city|        cc_county|cc_state|cc_zip|   
cc_country|cc_gmt_offset|cc_tax_percentage|
+-----------------+-----------------+-----------------+---------------+-----------------+---------------+-------------+--------+------------+--------+--------+----------------+---------+--------------------+--------------------+-----------------+-----------+----------------+----------+---------------+----------------+--------------+--------------+---------------+-----------+-----------------+--------+------+-------------+-------------+-----------------+
|                1| AAAAAAAABAAAAAAA|       1998-01-01|           null|         
    null|        2450997|     NY Metro|   large|        2325| 
1374075|8AM-12AM|Keith Cunningham|        4|Matters may hear ...|New, cold 
plants ...|       Dante Cook|          3|             pri|         4|           
 ese|             995|      Park 3rd|           Dr.|      Suite 470|Five 
Points|   Ziebach County|      SD| 56098|United States|         -6.0|           
  0.02|
|                2| AAAAAAAACAAAAAAA|       1998-01-01|     2000-12-31|         
    null|        2450876| Mid Atlantic|   large|        4208|  837392| 8AM-4PM| 
   Stephen Clem|        3|Classes devote la...|Free germans prov...|Christopher 
Perez|          6|           cally|         3|            pri|             245| 
     Johnson |        Circle|      Suite 200|   Fairview|Williamson County|     
 TN| 35709|United States|         -5.0|             0.03|
|                3| AAAAAAAACAAAAAAA|       2001-01-01|           null|         
    null|        2450876| Mid Atlantic|   small|        3251|  837392| 8AM-4PM| 
William Johnson|        3|Classes devote la...|Ridiculous requir...|    Derrick 
Burke|          6|           cally|         3|            pri|             245| 
     Johnson |        Circle|      Suite 200|   Fairview|Williamson County|     
 TN| 35709|United States|         -5.0|             0.03|
|                4| AAAAAAAAEAAAAAAA|       1998-01-01|     2000-01-01|         
    null|        2450872|North Midwest|   large|        2596|  708708| 8AM-4PM| 
  Lamont Greene|        3|Events must find ...|Great rates must ...|      
Marvin Dean|          2|            able|         2|           able|            
 927|      Oak Main|            ST|      Suite 150|Five Points|Williamson 
County|      TN| 36098|United States|         -5.0|             0.03|
|                5| AAAAAAAAEAAAAAAA|       2000-01-02|     2001-12-31|         
    null|        2450872|North Midwest|  medium|        2596|  708708|8AM-12AM| 
  Lamont Greene|        3|Events must find ...|So fresh supplies...| Matthew 
Williams|          2|            able|         1|           able|             
927|      Oak Main|            ST|      Suite 150|Five Points|Williamson 
County|      TN| 36098|United States|         -5.0|              0.0|
|                6| AAAAAAAAEAAAAAAA|       2002-01-01|           null|         
    null|        2450872|North Midwest|   small|        2596|  708708| 8AM-4PM| 
  Emilio Romano|        6|As well novel sen...|Sophisticated cit...|  William 
Johnson|          5|            anti|         1|           able|             
927|      Oak Main|            ST|      Suite 150|Five Points|Williamson 
County|      TN| 36098|United States|         -5.0|             0.07|
+-----------------+-----------------+-----------------+---------------+-----------------+---------------+-------------+--------+------------+--------+--------+----------------+---------+--------------------+--------------------+-----------------+-----------+----------------+----------+---------------+----------------+--------------+--------------+---------------+-----------+-----------------+--------+------+-------------+-------------+-----------------+

From: Somasundaram Sekar [mailto:[email protected]]
Sent: Sunday, October 08, 2017 5:30 PM
To: [email protected]
Subject: Equivalent of Redshift ListAgg function in Spark (Pyspak)

Hi,

I want to concat multiple columns into a single column after grouping the  
DataFrame,

I want an functional equivalent of Redshift ListAgg function

pg_catalog.Listagg(column, '|')
         within GROUP( ORDER BY column) AS
name

LISTAGG Function
: For each group in a query, the LISTAGG aggregate function orders the rows for 
that group according to the ORDER BY expression, then concatenates the values 
into a single string.
DISCLAIMER
==========
This e-mail may contain privileged and confidential information which is the 
property of Persistent Systems Ltd. It is intended only for the use of the 
individual or entity to which it is addressed. If you are not the intended 
recipient, you are not authorized to read, retain, copy, print, distribute or 
use this message. If you have received this communication in error, please 
notify the sender and delete all copies of this message. Persistent Systems 
Ltd. does not accept any liability for virus infected mails.

Reply via email to