[jira] [Commented] (HUDI-7007) Integrate functional index using bloom filter on reader side

2024-04-09 Thread Vinaykumar Bhat (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-7007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17835348#comment-17835348
 ] 

Vinaykumar Bhat commented on HUDI-7007:
---

I am not sure that creating functional indexes (using col stats) works 
correctly. For example, creating a functional index on a table with three 
existing files fails to process all the files: the col-stats in the MDT are 
created for only one of the files. HUDI-7579 has more details.
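
A check for the symptom described above could look roughly like the following 
sketch, which compares the table's data files against the files that have a 
col-stats record. The record layout (a `file_name` key plus min/max values) and 
the file names are assumptions for illustration only, not the actual MDT schema.

```python
def files_missing_stats(data_files, stats_records):
    """Return the data files that have no col-stats record, sorted by name."""
    # Hypothetical record shape: each stats record carries the file it describes.
    indexed = {record["file_name"] for record in stats_records}
    return sorted(f for f in data_files if f not in indexed)


# Three existing files, but stats were written for only one of them.
data_files = ["file-1.parquet", "file-2.parquet", "file-3.parquet"]
stats_records = [
    {"file_name": "file-1.parquet", "min": "2024-03-01", "max": "2024-03-05"},
]

missing = files_missing_stats(data_files, stats_records)
print(missing)
```

A non-empty result means the index was built over only part of the table.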

> Integrate functional index using bloom filter on reader side
> 
>
> Key: HUDI-7007
> URL: https://issues.apache.org/jira/browse/HUDI-7007
> Project: Apache Hudi
> Issue Type: Task
> Reporter: Sagar Sumit
> Assignee: Vinaykumar Bhat
> Priority: Major
> Labels: hudi-1.0.0-beta2
> Fix For: 1.0.0
>
>
> Currently, one can create a functional index on a column using bloom filters. 
> However, only the one created using column stats is supported on the reader 
> side (check `FunctionalIndexSupport`). This ticket tracks the support for 
> using bloom filters on functional index in the reader path.
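
To make the difference between the two index types in the description concrete, 
here is a minimal Python sketch of how a reader might prune files using either 
per-file min/max stats or a per-file bloom filter built over a function of a 
column. This is not Hudi's implementation: `BloomFilter`, `FileIndexEntry`, 
`build_entry` and the two `prune_*` helpers are hypothetical names, and the 
indexed function (the date prefix of a timestamp string) is an assumed example.

```python
import hashlib
from dataclasses import dataclass


class BloomFilter:
    """Tiny bloom filter: num_hashes probes into a bit set of num_bits bits."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit set

    def _positions(self, value):
        # Derive num_hashes bit positions by salting SHA-256 with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{value}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, value):
        for pos in self._positions(value):
            self.bits |= 1 << pos

    def might_contain(self, value):
        # False means definitely absent; True means possibly present.
        return all((self.bits >> pos) & 1 for pos in self._positions(value))


@dataclass
class FileIndexEntry:
    file_name: str
    min_value: str
    max_value: str
    bloom: BloomFilter


def build_entry(file_name, rows, func):
    """Index func(row) for one file: min/max stats plus a bloom filter."""
    values = [func(row) for row in rows]
    bloom = BloomFilter()
    for value in values:
        bloom.add(value)
    return FileIndexEntry(file_name, min(values), max(values), bloom)


def prune_with_col_stats(entries, target):
    """Keep files whose [min, max] range covers the target value."""
    return [e.file_name for e in entries if e.min_value <= target <= e.max_value]


def prune_with_bloom(entries, target):
    """Keep files whose bloom filter cannot rule out the target value."""
    return [e.file_name for e in entries if e.bloom.might_contain(target)]


def to_date(ts):
    # Assumed indexed function: the date part of a timestamp string.
    return ts[:10]


entries = [
    build_entry("f1", ["2024-03-01 10:00", "2024-03-05 12:00"], to_date),
    build_entry("f2", ["2024-03-03 09:00", "2024-03-09 08:00"], to_date),
    build_entry("f3", ["2024-03-20 07:30"], to_date),
]

print(prune_with_col_stats(entries, "2024-03-04"))
print(prune_with_bloom(entries, "2024-03-05"))
```

Min/max stats keep every file whose range covers the value, so f1 and f2 both 
survive a lookup for 2024-03-04 even though neither contains that date. A bloom 
filter answers point lookups and can rule such files out, at the cost of 
occasional false positives, which is why it is attractive for equality 
predicates in the reader path.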



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (HUDI-7007) Integrate functional index using bloom filter on reader side

2024-03-27 Thread Vinaykumar Bhat (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-7007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17831645#comment-17831645
 ] 

Vinaykumar Bhat commented on HUDI-7007:
---

It seems that `FunctionalIndexSupport::loadFunctionalIndexDataFrame(...)` is 
always called (from `HoodieFileIndex::lookupCandidateFilesInMetadataTable(...)`) 
with an empty `indexPartition` string. So it is likely that file pruning based 
on a functional index is not supported.



[jira] [Commented] (HUDI-7007) Integrate functional index using bloom filter on reader side

2024-03-27 Thread Vinaykumar Bhat (Jira)


[ 
https://issues.apache.org/jira/browse/HUDI-7007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17831243#comment-17831243
 ] 

Vinaykumar Bhat commented on HUDI-7007:
---

[~codope] - I need some pointers on this. Are there any tests that execute a 
query that prunes files based on a functional index? I saw 
`TestFunctionalIndex.scala`, but none of the tests there seem to have such a 
query.
