[ 
https://issues.apache.org/jira/browse/HUDI-686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17064472#comment-17064472
 ] 

sivabalan narayanan commented on HUDI-686:
------------------------------------------

Interesting impl, [~vinoth]. Some initial thoughts:
 * Wrt candidates, I don't think we will run into OOM, as it is bounded to one 
partition. 
 * May I know why we need an external spillable map? Why can't we use a regular 
map? I don't see the benefit of an external spillable map if all entries can be 
held in memory. Here too, one executor has to hold at most the file infos for 
a single partition, right? So memory is bounded here too, in my 
understanding. 
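
To make the second question concrete, here is a toy sketch of the trade-off being asked about. It is not Hudi's ExternalSpillableMap; the class name SpillableMapSketch, the entry-count budget, and the "file-N" keys are all hypothetical. The only point is that a plain map keeps every value on the heap, while a spillable one keeps a bounded number in memory and writes the rest to a local file, holding just an offset index on the heap.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

/** Toy spillable map (hypothetical; NOT Hudi's ExternalSpillableMap). */
public class SpillableMapSketch implements AutoCloseable {
    // Assumption: the memory budget is counted in entries, not bytes.
    private final int maxInMemoryEntries;
    private final Map<String, String> inMemory = new HashMap<>();
    // key -> {offset, length} of the value inside the spill file
    private final Map<String, long[]> diskIndex = new HashMap<>();
    private final File spillFile;
    private final RandomAccessFile disk;

    public SpillableMapSketch(int maxInMemoryEntries) throws IOException {
        this.maxInMemoryEntries = maxInMemoryEntries;
        this.spillFile = File.createTempFile("spill", ".bin");
        this.spillFile.deleteOnExit();
        this.disk = new RandomAccessFile(spillFile, "rw");
    }

    public void put(String key, String value) throws IOException {
        boolean fitsInMemory = inMemory.containsKey(key)
                || (!diskIndex.containsKey(key) && inMemory.size() < maxInMemoryEntries);
        if (fitsInMemory) {
            inMemory.put(key, value);
            return;
        }
        // Over budget (or key already spilled): append the value to the file
        // and remember where it lives.
        byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
        long offset = disk.length();
        disk.seek(offset);
        disk.write(bytes);
        diskIndex.put(key, new long[] {offset, bytes.length});
    }

    public String get(String key) throws IOException {
        String value = inMemory.get(key);
        if (value != null) {
            return value;
        }
        long[] location = diskIndex.get(key);
        if (location == null) {
            return null; // key is in neither tier
        }
        byte[] bytes = new byte[(int) location[1]];
        disk.seek(location[0]);
        disk.readFully(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public int inMemorySize() { return inMemory.size(); }

    public int spilledSize() { return diskIndex.size(); }

    @Override
    public void close() throws IOException {
        disk.close();
        spillFile.delete();
    }

    public static void main(String[] args) throws IOException {
        try (SpillableMapSketch map = new SpillableMapSketch(2)) {
            map.put("file-1", "info-1");
            map.put("file-2", "info-2");
            map.put("file-3", "info-3"); // exceeds the budget, goes to disk
            System.out.println(map.inMemorySize() + " in memory, "
                    + map.spilledSize() + " spilled, file-3 = " + map.get("file-3"));
        }
    }
}
```

If the file infos for one partition always fit in the budget, the spill path is never taken and this behaves like a regular map, which is the case the question above is getting at.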

 

> Implement BloomIndexV2 that does not depend on memory caching
> -------------------------------------------------------------
>
>                 Key: HUDI-686
>                 URL: https://issues.apache.org/jira/browse/HUDI-686
>             Project: Apache Hudi (incubating)
>          Issue Type: Improvement
>          Components: Index, Performance
>            Reporter: Vinoth Chandar
>            Assignee: Vinoth Chandar
>            Priority: Major
>             Fix For: 0.6.0
>
>         Attachments: Screen Shot 2020-03-19 at 10.15.10 AM.png, Screen Shot 
> 2020-03-19 at 10.15.10 AM.png, Screen Shot 2020-03-19 at 10.15.10 AM.png, 
> image-2020-03-19-10-17-43-048.png
>
>
> Main goal here is to provide a much simpler index, without advanced 
> optimizations like auto-tuned parallelism/skew handling, but with a better 
> out-of-box experience for small workloads. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
