Re: Does spark support something like the bind function in R?

2022-02-08 Thread ayan guha
Hi, in Python, or in Spark in general, you can just "read" the files and select the column. I am assuming you are reading each file individually into separate dataframes and joining them. Instead, you can read all the files into a single dataframe and select one column. On Wed, Feb 9, 2022 at 2:55 AM

Does spark support something like the bind function in R?

2022-02-08 Thread Andrew Davidson
I need to create a single table by selecting one column from thousands of files. The columns are all of the same type and have the same number of rows and the same row names. I am currently using join. I get OOM on a mega-mem cluster with 2.8 TB. Does Spark have something like cbind() “Take a sequence of