[jira] [Commented] (HADOOP-18187) Convert s3a prefetching to use JavaDoc for fields and enums

2022-04-27 Bhalchandra Pandit (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17528853#comment-17528853
 ] 

Bhalchandra Pandit commented on HADOOP-18187:
---------------------------------------------

Thanks for the links. I will take care of the comments.

> Convert s3a prefetching to use JavaDoc for fields and enums
> -----------------------------------------------------------
>
> Key: HADOOP-18187
> URL: https://issues.apache.org/jira/browse/HADOOP-18187
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Daniel Carl Jones
>Assignee: Bhalchandra Pandit
>Priority: Minor
>
> There are lots of good comments for fields and enum values in the current code. 
> Let's expose these to your IDE with JavaDoc instead.
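
(For illustration only: a minimal before/after sketch of the kind of conversion this asks for. The field and enum names below are made up, not taken from the actual prefetching code.)

{code:java}
// Illustration only: hypothetical names, not the real prefetching classes.
public class PrefetchingJavadocExample {

  // Before: a plain trailing comment, invisible to the IDE's quick documentation:
  //   private int prefetchBlockSize; // size of each prefetched block in bytes

  /** Size of each prefetched block, in bytes. */
  private int prefetchBlockSize;

  /** State of a block within the prefetch cache. */
  public enum BlockState {
    /** The block has not been requested yet. */
    NOT_READY,
    /** A prefetch of the block is in progress. */
    QUEUED,
    /** The block contents are available in the cache. */
    READY
  }
}
{code}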






[jira] [Commented] (HADOOP-18187) Convert s3a prefetching to use JavaDoc for fields and enums

2022-04-26 Bhalchandra Pandit (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17528442#comment-17528442
 ] 

Bhalchandra Pandit commented on HADOOP-18187:
---------------------------------------------

[~dannycjones] Do you have a reference source file whose comment style I can 
use?

> Convert s3a prefetching to use JavaDoc for fields and enums
> -----------------------------------------------------------
>
> Key: HADOOP-18187
> URL: https://issues.apache.org/jira/browse/HADOOP-18187
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Daniel Carl Jones
>Assignee: Bhalchandra Pandit
>Priority: Minor
>
> There are lots of good comments for fields and enum values in the current code. 
> Let's expose these to your IDE with JavaDoc instead.






[jira] [Assigned] (HADOOP-18187) Convert s3a prefetching to use JavaDoc for fields and enums

2022-04-26 Bhalchandra Pandit (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bhalchandra Pandit reassigned HADOOP-18187:
---------------------------------------------

Assignee: Bhalchandra Pandit

> Convert s3a prefetching to use JavaDoc for fields and enums
> -----------------------------------------------------------
>
> Key: HADOOP-18187
> URL: https://issues.apache.org/jira/browse/HADOOP-18187
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Daniel Carl Jones
>Assignee: Bhalchandra Pandit
>Priority: Minor
>
> There are lots of good comments for fields and enum values in the current code. 
> Let's expose these to your IDE with JavaDoc instead.






[jira] [Commented] (HADOOP-18182) S3File to store reference to active S3Object in a field.

2022-04-26 Bhalchandra Pandit (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17528440#comment-17528440
 ] 

Bhalchandra Pandit commented on HADOOP-18182:
---------------------------------------------

[~ste...@apache.org] I did not understand your comment: "as long as the s3 
object reference lifespan is > than that of the stream's use, all is well."

Can you please clarify?

> S3File to store reference to active S3Object in a field.
> ---------------------------------------------------------
>
> Key: HADOOP-18182
> URL: https://issues.apache.org/jira/browse/HADOOP-18182
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Steve Loughran
>Assignee: Bhalchandra Pandit
>Priority: Major
>
> HADOOP-17338 showed us how a recent {{S3Object.finalize()}} implementation can 
> call {{stream.close()}} and so close an active stream if a GC happens during a 
> read. Replicate the same fix here.
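
(To illustrate the pattern the description refers to, a minimal sketch follows. The class, field, and method names are hypothetical rather than the actual S3File code; the point is that holding the S3Object in a field keeps it strongly reachable for the stream's lifetime, so its finalizer cannot close the stream mid-read.)

{code:java}
import java.io.IOException;

import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.model.GetObjectRequest;
import com.amazonaws.services.s3.model.S3Object;
import com.amazonaws.services.s3.model.S3ObjectInputStream;

// Illustrative only; does not claim to match the actual S3File implementation.
class S3ObjectHolder {
  private final AmazonS3 client;

  // Strong reference kept for as long as the stream is in use, so that
  // S3Object.finalize() cannot close the stream if a GC runs during a read.
  private S3Object s3Object;
  private S3ObjectInputStream stream;

  S3ObjectHolder(AmazonS3 client) {
    this.client = client;
  }

  void open(String bucket, String key) {
    s3Object = client.getObject(new GetObjectRequest(bucket, key));
    stream = s3Object.getObjectContent();
  }

  int read(byte[] buffer) throws IOException {
    return stream.read(buffer);
  }

  void close() throws IOException {
    stream.close();
    s3Object = null; // only now may the S3Object become unreachable
  }
}
{code}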






[jira] [Assigned] (HADOOP-18182) S3File to store reference to active S3Object in a field.

2022-04-26 Bhalchandra Pandit (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bhalchandra Pandit reassigned HADOOP-18182:
---------------------------------------------

Assignee: Bhalchandra Pandit

> S3File to store reference to active S3Object in a field.
> ---------------------------------------------------------
>
> Key: HADOOP-18182
> URL: https://issues.apache.org/jira/browse/HADOOP-18182
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.4.0
>Reporter: Steve Loughran
>Assignee: Bhalchandra Pandit
>Priority: Major
>
> HADOOP-17338 showed us how a recent {{S3Object.finalize()}} implementation can 
> call {{stream.close()}} and so close an active stream if a GC happens during a 
> read. Replicate the same fix here.






[jira] [Created] (HADOOP-18028) improve S3 read speed using prefetching & caching

2021-11-29 Bhalchandra Pandit (Jira)
Bhalchandra Pandit created HADOOP-18028:
---------------------------------------------

 Summary: improve S3 read speed using prefetching & caching
 Key: HADOOP-18028
 URL: https://issues.apache.org/jira/browse/HADOOP-18028
 Project: Hadoop Common
  Issue Type: Improvement
  Components: fs/s3
Reporter: Bhalchandra Pandit


I work for Pinterest. I developed a technique for vastly improving read 
throughput when reading from the S3 file system. It not only helps the 
sequential read case (like reading a SequenceFile) but also significantly 
improves read throughput in the random access case (like reading Parquet). This 
technique has been very useful in improving the efficiency of data processing 
jobs at Pinterest.

I would like to contribute that feature to Apache Hadoop. More details on the 
technique are available in a blog post I wrote recently:
[https://medium.com/pinterest-engineering/improving-efficiency-and-reducing-runtime-using-s3-read-optimization-b31da4b60fa0]
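
(A toy sketch of the prefetching-and-caching idea, purely for illustration; it is not the contributed implementation and makes no claim about its design. While the caller consumes block N, blocks N+1..N+depth are fetched in the background and cached.)

{code:java}
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

// Toy block prefetcher: no eviction, error handling, or bounds checking.
class BlockPrefetcher {
  private final ExecutorService pool = Executors.newFixedThreadPool(4);
  private final Map<Long, CompletableFuture<byte[]>> cache = new ConcurrentHashMap<>();
  private final Function<Long, byte[]> fetchBlock; // e.g. a ranged GET of one block
  private final int depth;

  BlockPrefetcher(Function<Long, byte[]> fetchBlock, int depth) {
    this.fetchBlock = fetchBlock;
    this.depth = depth;
  }

  byte[] read(long blockNumber) throws Exception {
    // Queue the next blocks before blocking on the requested one.
    for (long b = blockNumber + 1; b <= blockNumber + depth; b++) {
      final long next = b;
      cache.computeIfAbsent(next,
          k -> CompletableFuture.supplyAsync(() -> fetchBlock.apply(next), pool));
    }
    return cache.computeIfAbsent(blockNumber,
        k -> CompletableFuture.supplyAsync(() -> fetchBlock.apply(blockNumber), pool))
        .get();
  }
}
{code}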
 






[jira] [Commented] (HADOOP-17930) implement non-guava Precondition checkState

2021-09-23 Bhalchandra Pandit (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17419313#comment-17419313
 ] 

Bhalchandra Pandit commented on HADOOP-17930:
---------------------------------------------

If it helps, I have already added the required support via this pull request 
(to the Thrift project):

[https://github.com/apache/thrift/pull/2439]

I will also include the same helper when I create the pull request for adding 
the S3 read optimization in a few days.

> implement non-guava Precondition checkState
> -------------------------------------------
>
> Key: HADOOP-17930
> URL: https://issues.apache.org/jira/browse/HADOOP-17930
> Project: Hadoop Common
>  Issue Type: Sub-task
>Affects Versions: 3.4.0, 3.2.3, 3.3.2
>Reporter: Ahmed Hussein
>Assignee: Ahmed Hussein
>Priority: Major
>
> In order to replace Guava Preconditions, we need to implement our own 
> versions of the API.
>  This Jira is to add the implementation of {{checkState}} to the existing class 
> {{org.apache.hadoop.util.Preconditions}}.
> +The plan is as follows+
>  * Implement {{org.apache.hadoop.util.Preconditions.checkState}} with the 
> minimum set of interfaces used in the current Hadoop repo.
>  * We can replace {{guava.Preconditions}} with 
> {{org.apache.hadoop.util.Preconditions}} once all the interfaces have been 
> implemented (both this Jira and HADOOP-17929 are complete).
>  * We need the change to be easy to backport to 3.x.
> Previous Jiras:
>  * HADOOP-17126 was created to implement checkNotNull.
>  * HADOOP-17929 implements checkArgument.
> CC: [~ste...@apache.org], [~vjasani]
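
(For context, a rough sketch of what such a checkState helper could look like; it assumes nothing about the final API beyond the description above, and the real org.apache.hadoop.util.Preconditions may use different signatures.)

{code:java}
// Sketch only; not the actual Hadoop Preconditions class.
public final class Preconditions {
  private Preconditions() {
  }

  /** Throws IllegalStateException if the state expression is false. */
  public static void checkState(boolean expression) {
    if (!expression) {
      throw new IllegalStateException();
    }
  }

  /**
   * Variant with a formatted message, e.g.
   * checkState(isOpen, "stream %s is already closed", name).
   */
  public static void checkState(boolean expression, String template, Object... args) {
    if (!expression) {
      throw new IllegalStateException(String.format(template, args));
    }
  }
}
{code}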


