Re: improving efficiency and reducing runtime using S3 read optimization

2021-08-31 Thread Stolojan, Bogdan
proves read throughput of a random access case (like reading Parquet). This technique has been very useful in significantly improving efficiency of the data processing jobs at Pinterest. I would like to contribute that feature to Apache

Re: improving efficiency and reducing runtime using S3 read optimization

2021-08-29 Thread Bhalchandra Pandit
system. It not only helps the sequential read case (like reading a SequenceFile) but also significantly improves read throughput of a random access case (like reading Parquet). This technique has been very useful in significantly improving ef

Re: improving efficiency and reducing runtime using S3 read optimization

2021-08-29 Thread Bhalchandra Pandit
improves read throughput of a random access case (like reading Parquet). This technique has been very useful in significantly improving efficiency of the data processing jobs at Pinterest. I would like to contribute that feature to Apache Hado

Re: improving efficiency and reducing runtime using S3 read optimization

2021-08-26 Thread Steve Loughran
rquet). This technique has been very useful in significantly improving efficiency of the data processing jobs at Pinterest. I would like to contribute that feature to Apache Hadoop. More details on this technique are available in this blog I wrote recently: https://medium.c

Re: improving efficiency and reducing runtime using S3 read optimization

2021-08-25 Thread larry mccay
this technique are available in this blog I wrote recently: https://medium.com/pinterest-engineering/improving-efficiency-and-reducing-runtime-using-s3-read-optimization-b31da4b60fa0 I would like to know if you believe it to be a useful contribution. If so, I will follo

improving efficiency and reducing runtime using S3 read optimization

2021-08-25 Thread Bhalchandra Pandit
/improving-efficiency-and-reducing-runtime-using-s3-read-optimization-b31da4b60fa0 I would like to know if you believe it to be a useful contribution. If so, I will follow the steps outlined on the how to contribute <https://cwiki.apache.org/confluence/display/HADOOP/How+To+Contribute> page. Kumar