kosiew commented on code in PR #22839:
URL: https://github.com/apache/datafusion/pull/22839#discussion_r3408057313
##########
datafusion-cli/src/main.rs:
##########
@@ -330,8 +340,16 @@ fn get_session_config(args: &Args) ->
Result<SessionConfig> {
config_options.format.null = String::from("NULL");
}
- let session_config =
+ let mut session_config =
SessionConfig::from(config_options).with_information_schema(true);
+
+ if args.repl_mode() {
Review Comment:
Nice improvement here. The new guard correctly marks stdin as command input
when `args.repl_mode()` is true.
One case still seems to slip through though: `datafusion-cli -f /dev/stdin`.
In that mode, stdin is also carrying SQL commands, but `repl_mode()` is false
because a file argument is present. Since `parse_valid_file` accepts
`/dev/stdin`, `exec_from_files` ends up opening stdin as a regular file.
That means `StdinCarriesCommands` is not installed, so
`StdinUtils::object_store` can still try to read process stdin for table data
after the SQL reader has already consumed or buffered it.
For example:
```bash
printf "CREATE EXTERNAL TABLE t STORED AS CSV LOCATION '/dev/stdin';\nSELECT
123;\n" \
| datafusion-cli -q -f /dev/stdin
```
In this scenario, the table registration path can still end up reading stdin
instead of returning the intended `stdin is already being read for SQL
commands` error.
Could we extend the guard to cover file arguments that resolve to stdin
pseudo-files (`/dev/stdin`, `/dev/fd/0`, `/proc/self/fd/0`) as well? A
regression test covering `-f /dev/stdin` would also be helpful.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]