paul-rogers commented on PR #2937: URL: https://github.com/apache/drill/pull/2937#issuecomment-2322784235
> @paul-rogers I'm not aware of any recent significant work on the Parquet reader. I know @jnturton did some work regarding adding new compression capabilities and there have been a few bug fixes here and there, but nothing major as I recall. So I don't think we've added any Parquet "prescan" that I am aware of. Ah. I'm misunderstanding something. I outlined a number of the common problems we encounter when we try to figure out schema dynamically. A response suggested that this pull request solves this because it has access to the full Parquet schema in the planner. The only way to get schema that is either to use the old Parquet metadata cache, or something that scans all the Parquet files in the planner. I thought I saw a statement that such a scan was being done. To prevent further confusion, what is the source of the Parquet schema in this fix? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org