paul-rogers commented on issue #1618: DRILL-6950: Row set-based scan framework
URL: https://github.com/apache/drill/pull/1618#issuecomment-466716112

It seems we have a number of ambiguous issues to resolve around implicit columns:

* Their type.
* Their scope.

Type: The CSV reader (and any reader that uses ScanBatch) defines file metadata (AKA implicit) columns as nullable VARCHAR. The new framework defines them as required VARCHAR. One can argue that the required mode is a) more correct, and b) more efficient.

Scope: The CSV reader today does not define the partition columns for a class-path reader. If a query references them, their type is nullable INT (because they are undefined). The revised framework always makes the partition columns available whenever the file metadata columns are available. These columns are of type nullable VARCHAR (since, if a partition does not exist, the column is set to null).

My recommendation is to a) ensure we have plenty of tests of current behavior, and b) go with the new behavior as we switch to the new readers. We should, however, debate this suggestion to ensure everyone agrees.
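To make the proposed behavior concrete for the debate, here is a minimal sketch of the column modes the revised framework would produce for one file. It is plain Python, not Drill code: the `Column` class, the `implicit_columns` function, and the `max_depth` parameter are hypothetical names made up for illustration, and the column names `fqn`, `filename`, and `dirN` are assumed to follow Drill's usual naming.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import List, Optional


@dataclass(frozen=True)
class Column:
    """Hypothetical stand-in for a column's schema plus its per-file value."""
    name: str
    sql_type: str          # always "VARCHAR" in this sketch
    mode: str              # "REQUIRED" or "OPTIONAL" (nullable)
    value: Optional[str]


def implicit_columns(root: str, file_path: str, max_depth: int) -> List[Column]:
    """Model the proposed rules for a single file under a scan root.

    File metadata columns are REQUIRED VARCHAR because every file has them.
    Partition columns are OPTIONAL VARCHAR and are null when the file is
    not nested deeply enough for that partition level to exist.
    """
    path = PurePosixPath(file_path)
    # Directory names between the scan root and the file itself.
    dirs = path.relative_to(PurePosixPath(root)).parts[:-1]

    cols = [
        Column("fqn", "VARCHAR", "REQUIRED", str(path)),
        Column("filename", "VARCHAR", "REQUIRED", path.name),
    ]
    for i in range(max_depth):
        value = dirs[i] if i < len(dirs) else None
        cols.append(Column(f"dir{i}", "VARCHAR", "OPTIONAL", value))
    return cols


if __name__ == "__main__":
    for col in implicit_columns("/data/logs", "/data/logs/2019/feb/a.csv", 3):
        print(col)
```

In this sketch, a file two directories below the root gets values for `dir0` and `dir1` and a null `dir2`, while the file metadata columns never need a null. That is the argument for required mode on the metadata columns and nullable mode on the partition columns.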
