Github user mehant commented on a diff in the pull request:
https://github.com/apache/drill/pull/140#discussion_r38548103
--- Diff:
exec/java-exec/src/main/java/org/apache/drill/exec/store/dfs/WorkspaceSchemaFactory.java
---
@@ -321,8 +327,101 @@ public DrillTable create(String key) {
return null;
}
+ private FormatMatcher findMatcher(FileStatus file) {
+ FormatMatcher matcher = null;
+ try {
+ for (FormatMatcher m : dropFileMatchers) {
+ if (m.isFileReadable(fs, file)) {
+ return m;
+ }
+ }
+ } catch (IOException e) {
+ logger.debug("Failed to find format matcher for file: %s", file, e);
+ }
+ return matcher;
+ }
+
@Override
public void destroy(DrillTable value) {
}
+
+ /**
+ * Check if the table contains homogenenous files that can be read by Drill. Eg: parquet, json csv etc.
+ * However if it contains more than one of these formats or a totally different file format that Drill cannot
+ * understand then we will raise an exception.
+ * @param key
+ * @return
+ * @throws IOException
+ */
+ private boolean isHomogeneous(String key) throws IOException {
--- End diff --
The only reason was to avoid the performance penalty these checks would add
to reads. Instead, we simply go ahead optimistically and fail later if we hit
a different format.
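
To make the trade-off concrete, here is a rough, self-contained sketch of the two options. It is not the Drill code: `SuffixMatcher`, `readOptimistically`, and the suffix-based readability test are hypothetical stand-ins for `FormatMatcher` and the real readers, and file names stand in for `FileStatus`. The eager variant probes every file before reading anything; the optimistic variant probes only the first file and lets a mismatch surface during the read.

```java
import java.util.Arrays;
import java.util.List;

public class FormatCheckSketch {

  /** Hypothetical stand-in for a format matcher: recognizes files by suffix only. */
  static final class SuffixMatcher {
    final String name;
    final String suffix;

    SuffixMatcher(String name, String suffix) {
      this.name = name;
      this.suffix = suffix;
    }

    boolean isFileReadable(String file) {
      return file.endsWith(suffix);
    }
  }

  /** Returns the first matcher that accepts the file, or null if none does. */
  static SuffixMatcher findMatcher(List<SuffixMatcher> matchers, String file) {
    for (SuffixMatcher m : matchers) {
      if (m.isFileReadable(file)) {
        return m;
      }
    }
    return null;
  }

  /** Eager check: probes every file up front, so the cost grows with the file count. */
  static boolean isHomogeneous(List<SuffixMatcher> matchers, List<String> files) {
    SuffixMatcher first = null;
    for (String file : files) {
      SuffixMatcher m = findMatcher(matchers, file);
      if (m == null) {
        return false;            // a format nobody understands
      }
      if (first == null) {
        first = m;               // remember the format of the first file
      } else if (m != first) {
        return false;            // a second, different format
      }
    }
    return true;
  }

  /**
   * Optimistic path: probes only the first file, then reads. A mismatch is
   * detected while reading (modeled here by isFileReadable) and fails late.
   * Returns the number of files read.
   */
  static int readOptimistically(List<SuffixMatcher> matchers, List<String> files) {
    if (files.isEmpty()) {
      return 0;
    }
    SuffixMatcher chosen = findMatcher(matchers, files.get(0));
    if (chosen == null) {
      throw new IllegalStateException("No matcher for " + files.get(0));
    }
    int read = 0;
    for (String file : files) {
      if (!chosen.isFileReadable(file)) {
        throw new IllegalStateException("Unexpected format while reading " + file);
      }
      read++;
    }
    return read;
  }

  public static void main(String[] args) {
    List<SuffixMatcher> matchers = Arrays.asList(
        new SuffixMatcher("parquet", ".parquet"),
        new SuffixMatcher("json", ".json"));
    System.out.println(isHomogeneous(matchers, Arrays.asList("a.parquet", "b.parquet")));
    System.out.println(isHomogeneous(matchers, Arrays.asList("a.parquet", "b.json")));
    System.out.println(readOptimistically(matchers, Arrays.asList("a.json", "b.json")));
  }
}
```

In this sketch the optimistic path does one probe regardless of directory size, at the price of a later and less specific failure on mixed directories.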
---