[ https://issues.apache.org/jira/browse/DRILL-4130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15027443#comment-15027443 ]
Tomer Shiran commented on DRILL-4130:
-------------------------------------

Maybe we should deprecate/remove the session variables and only have it as a
SELECT option? Most other properties related to reading a file (field
delimiter, extract CSV headers, etc.) are actually format options (which will
be available as SELECT options soon), so I think having these session/system
variables is inconsistent in the first place. Thoughts?

> Ability to set settings at Table or View level rather than SESSION or SYSTEM
> ----------------------------------------------------------------------------
>
>                 Key: DRILL-4130
>                 URL: https://issues.apache.org/jira/browse/DRILL-4130
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Metadata
>    Affects Versions: 1.3.0
>         Environment: All
>            Reporter: John Omernik
>              Labels: administration, settings
>             Fix For: Future
>
>
> There are a number of settings within drill for handling data that due to low
> level of granularity there may be unintended data reading consequences. A few
> examples include:
> store.json.read_numbers_as_double
> and
> store.json.all_text_mode
> (There are likely more, these are some I've worked with)
> The documentation on https://drill.apache.org/docs/json-data-model/ outlines
> how when dealing with certain types of data, that these settings can be
> helpful for reading, and indeed some queries fail with a suggestion to change
> these settings.
> A few points here. 1. The documentation suggests alter system commands. This
> is not ideal as it changes the default way drill handles data for all users
> AND not all users will (should) have the privs to enter this command. The
> documentation at a minimum should show alter session (or provide a clearer
> understanding of the difference)
> But even with alter session, that affects reads for all JSON files for that
> session, when in reality, the reasoning behind the setting is to be able to
> read a specific table that has poorly formed JSON. Thus, issuing a command
> that alters how Drill reads all JSON in order to read one table of JSON could
> have unintended consequences, especially for a user who just wants to be able
> to read things and issues commands without thinking things through.
> Now as an administrator, there are two use cases here. One is I have a table
> of poorly formed JSON that requires one of these settings, and I can't change
> the source, therefore, can I create a view that makes it so all reads of this
> table are done with the more permissive setting? Setting these in a view
> would be very helpful from an administrator perspective for known bad data
> sources. Keep users from having to think about it, and let them do their
> exploration.
> The other use case, is the ability for a user to set a session level read
> that only applies for the table being read. alter session set
> "%tablename%.store.json.read_numbers_as_double = true" (and have the errors
> that display use that as the default suggestion) that way, the user can issue
> the command, but not have downstream consequences in their session while
> reading other tables.
> Either case is valuable to an administrator, and could help prevent data read
> issues.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
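
For reference, the two scopes the issue contrasts can be sketched in Drill SQL
roughly as follows. The option name is taken from the issue text; the file path
is hypothetical, and the sketch assumes Drill's usual ALTER SESSION / ALTER
SYSTEM syntax with backtick-quoted option names. It illustrates the granularity
problem and is not a proposed fix.

```sql
-- Session scope: changes how every JSON file is read for the rest of this
-- session, not just the one table that needs it.
ALTER SESSION SET `store.json.all_text_mode` = true;

-- Hypothetical path, used only for illustration.
SELECT * FROM dfs.`/data/poorly_formed.json` LIMIT 10;

-- Set the option back by hand (false is assumed to be the prior value) so
-- that later JSON reads in this session are not affected.
ALTER SESSION SET `store.json.all_text_mode` = false;

-- System scope: changes the default for all users. The issue notes that not
-- every user will, or should, have the privileges to run this.
ALTER SYSTEM SET `store.json.all_text_mode` = true;
```

The manual set-and-restore step in the middle is what a table-level or
view-level setting, as requested in the issue, would make unnecessary.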