GitHub user bchalk101 closed a discussion: Implementing a custom parquet index
Hi,

I am trying to implement a custom index using a `TableProvider`, as suggested in the parquet index examples. Users want to run needle-in-the-haystack queries with low latency against certain columns, so I would like to index those columns. I have implemented the `TableProvider`, and basic indexing works.

However, I also want the features built into `ListingTable`, specifically hive partitioning. Is there a way to combine the two?

Beyond this, I am also getting an error when running a count on the dataset using my custom `TableProvider`:

```
[2025-02-24T19:38:39Z ERROR reader_service::datafusion_executor] Could not count dataset error: Internal error: Physical input schema should be the same as the one converted from logical input schema. Differences: .
```

No differences are printed after `Differences:`. Any idea what may be causing this? Is there something specific that I may be missing from my `TableProvider`?

Thanks

GitHub link: https://github.com/apache/datafusion/discussions/14858
