The config for dfs in the UI looks like this:

{
  "type": "file",
  "connection": "file:///",
  "workspaces": {
    "tmp": {
      "location": "/tmp",
      "writable": true,
      "defaultInputFormat": null,
      "allowAccessOutsideWorkspace": false
    },
    "root": {
      "location": "/",
      "writable": false,
      "defaultInputFormat": null,
      "allowAccessOutsideWorkspace": false
    },
    "home": {
      "location": "/Users/stefan",
      "writable": true,
      "defaultInputFormat": null,
      "allowAccessOutsideWorkspace": false
    }
  },
  "formats": {
    "parquet": {
      "type": "parquet"
    },
    "json": {
      "type": "json",
      "extensions": [
        "json"
      ]
    },
    "excel": {
      "type": "excel",
      "extensions": [
        "xlsx"
      ],
      "lastRow": 1048576,
      "ignoreErrors": true,
      "maxArraySize": -1,
      "thresholdBytesForTempFiles": -1
    },
    "spss": {
      "type": "spss",
      "extensions": [
        "sav"
      ]
    },
    "iceberg": {
      "type": "iceberg",
      "properties": null,
      "caseSensitive": null,
      "includeColumnStats": null,
      "ignoreResiduals": null,
      "snapshotId": null,
      "snapshotAsOfTime": null,
      "fromSnapshotId": null,
      "toSnapshotId": null
    },
    "httpd": {
      "type": "httpd",
      "extensions": [
        "httpd"
      ],
      "logFormat": "common\ncombined"
    },
    "xml": {
      "type": "xml",
      "extensions": [
        "xml"
      ],
      "dataLevel": 1
    },
    "syslog": {
      "type": "syslog",
      "extensions": [
        "syslog"
      ],
      "maxErrors": 10
    },
    "msaccess": {
      "type": "msaccess",
      "extensions": [
        "mdb",
        "accdb"
      ]
    },
    "hdf5": {
      "type": "hdf5",
      "extensions": [
        "h5"
      ],
      "defaultPath": null
    },
    "ltsv": {
      "type": "ltsv",
      "extensions": [
        "ltsv"
      ],
      "parseMode": "lenient",
      "escapeCharacter": null,
      "kvDelimiter": null,
      "entryDelimiter": null,
      "lineEnding": null,
      "quoteChar": null
    },
    "delta": {
      "type": "delta",
      "version": null,
      "timestamp": null
    },
    "shp": {
      "type": "shp",
      "extensions": [
        "shp"
      ]
    },
    "image": {
      "type": "image",
      "extensions": [
        "jpg",
        "jpeg",
        "jpe",
        "tif",
        "tiff",
        "dng",
        "psd",
        "png",
        "bmp",
        "gif",
        "ico",
        "pcx",
        "wav",
        "wave",
        "avi",
        "webp",
        "mov",
        "mp4",
        "m4a",
        "m4p",
        "m4b",
        "m4r",
        "m4v",
        "3gp",
        "3g2",
        "eps",
        "epsf",
        "epsi",
        "ai",
        "arw",
        "crw",
        "cr2",
        "nef",
        "orf",
        "raf",
        "rw2",
        "rwl",
        "srw",
        "x3f"
      ],
      "fileSystemMetadata": true,
      "descriptive": true
    },
    "pdf": {
      "type": "pdf",
      "extensions": [
        "pdf"
      ],
      "extractHeaders": true,
      "extractionAlgorithm": "basic"
    },
    "sas": {
      "type": "sas",
      "extensions": [
        "sas7bdat"
      ]
    },
    "pcap": {
      "type": "pcap",
      "extensions": [
        "pcap",
        "pcapng"
      ]
    }
  },
  "authMode": "SHARED_USER",
  "enabled": true
}

With this config I'm able to query some XML data: "SELECT * FROM
dfs.home.`ch.so.afu.abbaustellen.xml`;", which I actually don't want to be
possible (see the formats in the "storage-plugins-override.conf" file). If I
remove the xml format section from the config in the UI, I can no longer
query the XML file: "Error: VALIDATION ERROR: From line 1, column 15 to
line 1, column 51: Object 'ch.so.afu.abbaustellen.xml' not found within
'dfs.home'".

regards
Stefan


On Mon, Jul 10, 2023 at 9:15 PM Charles Givre <cgi...@gmail.com> wrote:

> Hi Stefan,
> What's in the config in the UI?  Can you also please clarify what queries
> are running which indicate that your configs aren't working?
> Best,
> -- C
>
>
>
> > On Jul 10, 2023, at 1:11 PM, Stefan Ziegler <stefan.ziegler...@gmail.com> wrote:
> >
> > "storage": {
> >  cp: {
> >    type: "file",
> >    connection: "classpath:///",
> >    formats: {
> >      "csv" : {
> >        type: "text",
> >        extensions: [ "csv" ],
> >        delimiter: ","
> >      }
> >    }
> >    enabled: true
> >  }
> > }
> > "storage": {
> >  dfs: {
> >    type: "file",
> >    connection: "file:///",
> >    workspaces: {
> >      "tmp": {
> >        "location": "/tmp",
> >        "writable": true,
> >        "defaultInputFormat": null,
> >        "allowAccessOutsideWorkspace": false
> >      },
> >      "home": {
> >        "location": "/Users/stefan",
> >        "writable": true,
> >        "defaultInputFormat": null,
> >        "allowAccessOutsideWorkspace": false
> >      },
> >      "root": {
> >        "location": "/",
> >        "writable": false,
> >        "defaultInputFormat": null,
> >        "allowAccessOutsideWorkspace": false
> >      }
> >    },
> >    formats: {
> >      "parquet": {
> >        "type": "parquet"
> >      },
> >      "json": {
> >        "type": "json",
> >        "extensions": [
> >          "json"
> >        ]
> >      }
> >    },
> >    enabled: true
> >  }
> > }
> > "storage": {
> >  s3: {
> >    type: "file",
> >    connection: "s3a://<my-bucket-name>",
> >    config: {
> >      "fs.s3a.aws.credentials.provider": "org.apache.hadoop.fs.s3a.AnonymousAWSCredentialsProvider",
> >      "fs.s3a.endpoint": "s3.eu-central-1.amazonaws.com",
> >      "fs.s3a.impl.disable.cache": "false"
> >    },
> >    workspaces: {
> >      "root": {
> >        "location": "/",
> >        "writable": false,
> >        "defaultInputFormat": "parquet",
> >        "allowAccessOutsideWorkspace": false
> >      }
> >    },
> >    "formats": {
> >      "parquet": {
> >        "type": "parquet"
> >      }
> >    },
> >    enabled: true
> >  }
> > }
> >
> >
> >
> >
> > On Mon, Jul 10, 2023 at 6:40 PM Charles Givre <cgi...@gmail.com> wrote:
> >
> >> Can you share your configs with any sensitive info redacted?  The lists
> >> don't support images, so please just cut/paste the json.
> >> I had another idea...
> >> -- C
> >>
> >>
> >>> On Jul 10, 2023, at 12:28 PM, Stefan Ziegler <stefan.ziegler...@gmail.com> wrote:
> >>>
> >>> Yes, I think I'm following these instructions. And the file is not
> >>> completely ignored. It creates additional format definitions. Let's say I
> >>> whitelist some formats in my storage configuration and Drill adds more
> >>> formats (which I don't want). Is there another way to start a "vanilla"
> >>> Drill installation with my own configurations?
> >>>
> >>> Stefan
> >>>
> >>> On Mon, Jul 10, 2023 at 6:17 PM Charles Givre <cgi...@gmail.com> wrote:
> >>>
> >>>> Hi Stefan,
> >>>> My apologies. OK, so the issue is that the storage-plugins-override.conf
> >>>> is being ignored. I've never actually used this feature, so I wasn't
> >>>> familiar with it, but are you following the instructions here [1] with
> >>>> respect to configuration and restarting Drill? My suggestion would be to
> >>>> remove all the plugins in the UI and only specify them in the .conf file.
> >>>> Drill has an order of precedence and I suspect what is happening is that
> >>>> the UI versions have a higher priority than the .conf versions. Does that
> >>>> make sense?
> >>>>
> >>>> -- C
> >>>>
> >>>> [1]: https://drill.apache.org/docs/configuring-storage-plugins/#configuring-storage-plugins-with-the-storage-plugins-overrideconf-file
> >>>>
> >>>>
> >>>>
> >>>>> On Jul 10, 2023, at 12:06 PM, Stefan Ziegler <stefan.ziegler...@gmail.com> wrote:
> >>>>>
> >>>>> Hi Charles
> >>>>>
> >>>>> I use a "storage-plugins-override.conf" file. My aim is to keep the
> >>>>> configuration for my storages in a single file that Drill picks up on
> >>>>> startup. I put "storage-plugins-override.conf" in the conf directory and
> >>>>> Drill creates the storages on startup but (and that is my problem) also
> >>>>> creates all formats for every storage defined in my config file. E.g. I
> >>>>> have a (local) file type storage and I define two formats (parquet and
> >>>>> json) in it. Drill does not respect my restriction to two formats in the
> >>>>> config file but creates all formats known to Drill (like iceberg, xml
> >>>>> etc.).
> >>>>>
> >>>>> regards
> >>>>> Stefan
> >>>>>
> >>>>> On Mon, Jul 10, 2023 at 5:30 PM Charles Givre <cgi...@gmail.com> wrote:
> >>>>>
> >>>>>> Hi Stefan,
> >>>>>> Thanks for your interest in Drill. You have to define the format config
> >>>>>> for each storage plugin. Otherwise Drill doesn't know what extension to
> >>>>>> associate with what format plugin. Out of curiosity, why are you using
> >>>>>> the .conf files for this?
> >>>>>> -- C
> >>>>>>
> >>>>>>
> >>>>>>> On Jul 9, 2023, at 12:03 PM, Stefan Ziegler <stefan.ziegler...@gmail.com> wrote:
> >>>>>>>
> >>>>>>> Not defining a format seems to prevent the user from querying the
> >>>>>>> specific format. E.g. after deleting the xml format definition in the
> >>>>>>> web GUI, I'm not able to query xml files anymore. So I guess my
> >>>>>>> assumption was right.
> >>>>>>>
> >>>>>>> Stefan
> >>>>>>>
> >>>>>>> On Sun, Jul 9, 2023 at 5:41 PM Stefan Ziegler <stefan.ziegler...@gmail.com> wrote:
> >>>>>>>
> >>>>>>>> Btw: I assumed that the list of formats acts as a restriction.
> >>>>>>>> Probably I'm wrong.
> >>>>>>>>
> >>>>>>>> Stefan
> >>>>>>>>
> >>>>>>>> On Sun, Jul 9, 2023 at 5:27 PM Stefan Ziegler <stefan.ziegler...@gmail.com> wrote:
> >>>>>>>>
> >>>>>>>>> Hi
> >>>>>>>>>
> >>>>>>>>> I'm using storage-plugins-override.conf to configure the storage
> >>>>>>>>> plugins on startup. My storage configurations contain only one or two
> >>>>>>>>> formats (parquet, json, csv). Checking the storages in the web GUI I
> >>>>>>>>> noticed that for all the storages all formats are enabled, e.g.
> >>>>>>>>> msaccess, iceberg etc.
> >>>>>>>>>
> >>>>>>>>> Is this on purpose or did I do something wrong?
> >>>>>>>>>
> >>>>>>>>> Example configuration:
> >>>>>>>>>
> >>>>>>>>> "storage": {
> >>>>>>>>> dfs: {
> >>>>>>>>> type: "file",
> >>>>>>>>> connection: "file:///",
> >>>>>>>>> workspaces: {
> >>>>>>>>>   "tmp": {
> >>>>>>>>>     "location": "/tmp",
> >>>>>>>>>     "writable": true,
> >>>>>>>>>     "defaultInputFormat": null,
> >>>>>>>>>     "allowAccessOutsideWorkspace": false
> >>>>>>>>>   },
> >>>>>>>>>   "root": {
> >>>>>>>>>     "location": "/",
> >>>>>>>>>     "writable": false,
> >>>>>>>>>     "defaultInputFormat": null,
> >>>>>>>>>     "allowAccessOutsideWorkspace": false
> >>>>>>>>>   }
> >>>>>>>>> },
> >>>>>>>>> formats: {
> >>>>>>>>>   "parquet": {
> >>>>>>>>>     "type": "parquet"
> >>>>>>>>>   },
> >>>>>>>>>   "json": {
> >>>>>>>>>     "type": "json",
> >>>>>>>>>     "extensions": [
> >>>>>>>>>       "json"
> >>>>>>>>>     ]
> >>>>>>>>>   }
> >>>>>>>>> },
> >>>>>>>>> enabled: true
> >>>>>>>>> }
> >>>>>>>>> }
> >>>>>>>>>
> >>>>>>>>> regards
> >>>>>>>>> Stefan
> >>>>>>>>>
> >>>>>>>>
> >>>>>>
> >>>>>>
> >>>>
> >>>>
> >>
> >>
>
>
