N Gautam Animesh created ARROW-17796:
----------------------------------------

             Summary: Using cbind when merging multi datasets using 
open_dataset on a directory.
                 Key: ARROW-17796
                 URL: https://issues.apache.org/jira/browse/ARROW-17796
             Project: Apache Arrow
          Issue Type: Task
            Reporter: N Gautam Animesh


I was wondering whether cbind can be used with specific column names when 
merging multiple datasets read with open_dataset(), so that only those 
columns are bound.

I was using open_dataset() to read multiple datasets from a single directory 
and wanted to merge them on particular columns that are common to all of the 
datasets.

Is it possible to merge these datasets column-wise? By default, 
open_dataset() combines the datasets row-wise, one after the other.
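
To make the distinction concrete, here is a minimal sketch in plain Python. 
It is not Arrow code and does not call open_dataset(); the two small tables, 
the "id" key column, and the merge_on() helper are all hypothetical and only 
illustrate row-wise combination versus a column-wise merge on a common column.

```python
# Illustration only: plain Python, no Arrow involved.
# The tables and the "id" key column are made up for this example.
ds_a = [{"id": 1, "x": 10}, {"id": 2, "x": 20}]
ds_b = [{"id": 1, "y": "a"}, {"id": 2, "y": "b"}]

# Row-wise combination: the rows of one dataset follow the rows of the other.
row_wise = ds_a + ds_b


def merge_on(left, right, key):
    """Column-wise merge: pair rows that share the same value in `key`."""
    right_by_key = {row[key]: row for row in right}
    return [
        {**row, **right_by_key[row[key]]}
        for row in left
        if row[key] in right_by_key  # keep only keys present in both tables
    ]


# Column-wise merge: each output row carries columns from both datasets.
col_wise = merge_on(ds_a, ds_b, "id")
```

The row-wise result has four rows, each with only its own dataset's columns, 
while the column-wise result has two rows, each with id, x, and y. The second 
shape is what I am after.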

Do let me know if there is anything like this, or any other workaround.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
