[ 
https://issues.apache.org/jira/browse/ARROW-18210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17627001#comment-17627001
 ] 

Antoine Pitrou edited comment on ARROW-18210 at 11/1/22 8:21 AM:
-----------------------------------------------------------------

I see. I don't think you can expect excellent performance from 
{{StreamWriter}}. Parquet is a columnar format, so you should feed the data 
column-wise rather than row-wise. Take a look at the {{TypedColumnWriter}} 
class and ensure you write data in batches.
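
To illustrate the column-wise, batched approach, here is a minimal sketch in plain C++ with no Parquet dependency. The {{Row}} struct, the {{ColumnBatcher}} class and the two sink callbacks are hypothetical names invented for this example. The sinks stand in for whatever per-column batch write call is used. In Parquet C++ that would presumably be {{TypedColumnWriter::WriteBatch}}, but how the batches map onto row groups is left out here.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical row type, as an application might produce it.
struct Row {
  int64_t id;
  double price;
};

// Hypothetical helper: collects rows into one contiguous buffer per column
// and hands each buffer to a per-column sink once batch_size rows are buffered.
class ColumnBatcher {
 public:
  using Int64Sink = std::function<void(const int64_t*, std::size_t)>;
  using DoubleSink = std::function<void(const double*, std::size_t)>;

  ColumnBatcher(std::size_t batch_size, Int64Sink id_sink, DoubleSink price_sink)
      : batch_size_(batch_size),
        id_sink_(std::move(id_sink)),
        price_sink_(std::move(price_sink)) {
    ids_.reserve(batch_size_);
    prices_.reserve(batch_size_);
  }

  // Row-wise entry point: only appends to the column buffers.
  void Append(const Row& row) {
    ids_.push_back(row.id);
    prices_.push_back(row.price);
    if (ids_.size() >= batch_size_) Flush();
  }

  // Emits one call per column for the whole batch, then resets the buffers.
  void Flush() {
    if (ids_.empty()) return;
    id_sink_(ids_.data(), ids_.size());
    price_sink_(prices_.data(), prices_.size());
    ids_.clear();
    prices_.clear();
  }

 private:
  std::size_t batch_size_;
  Int64Sink id_sink_;
  DoubleSink price_sink_;
  std::vector<int64_t> ids_;
  std::vector<double> prices_;
};
```

With this shape, the per-value overhead is limited to a {{push_back}}, and any per-column work happens once per batch rather than once per value. Call {{Flush()}} after the last row so that a partial final batch is not lost.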


was (Author: pitrou):
I see. I don't think you can expect excellent performance from StreamWriter. 
Parquet is a columnar format, so you should feed the data column-wise rather 
than row-wise. Take a look at the {{TypedColumnWriter}} and ensure you write 
data in batches.

> [C++][Parquet] Skip check in StreamWriter
> -----------------------------------------
>
>                 Key: ARROW-18210
>                 URL: https://issues.apache.org/jira/browse/ARROW-18210
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++, Parquet
>    Affects Versions: 10.0.0
>            Reporter: Madhur
>            Priority: Major
>
> Currently StreamWriter is slower only because of its column checks. If we 
> add a customization option (maybe a ctor arg) to skip the check, could 
> StreamWriter be more efficient?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
