alamb commented on a change in pull request #9534:
URL: https://github.com/apache/arrow/pull/9534#discussion_r580155082



##########
File path: rust/arrow/src/csv/reader.rs
##########
@@ -99,7 +99,27 @@ fn infer_field_schema(string: &str) -> DataType {
 /// If `max_read_records` is not set, the whole file is read to infer its schema.
 ///
 /// Return infered schema and number of records used for inference.
-fn infer_file_schema<R: Read + Seek>(
+pub fn infer_file_schema<R: Read + Seek>(
+    reader: &mut R,
+    delimiter: u8,
+    max_read_records: Option<usize>,
+    has_header: bool,
+) -> Result<(Schema, usize)> {
+    let (schema, records_count) =
+        infer_schema_from_reader(reader, delimiter, max_read_records, has_header)?;
+    // return the reader seek back to the start

Review comment:
Seeking back to the original location is what *I* personally would expect, but I don't know the history or rationale for the current behavior, so there might be a reason for it. It looks like it has been there since the [initial commit](https://github.com/apache/arrow/commit/ac45f3210a194049ef35f49847dbc4ff5e70d48f#diff-bee180737bbaeb6af694ff28ffe48f0426aa5555c0eda3bdd690636f91764551R178).

@nevi-me, do you remember whether there was any rationale for this behavior?
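
To make the distinction concrete, here is a minimal sketch of "seek back to the original location" as opposed to "seek back to the start". It uses only `std::io`. The names `with_restored_position`, `count_lines`, and `demo` are hypothetical and not part of the arrow crate, and `count_lines` is just a stand-in for schema inference.

```rust
use std::io::{Cursor, Read, Result, Seek, SeekFrom};

/// Hypothetical helper (not part of the arrow crate): runs `f` against the
/// reader, then restores the position the reader had on entry instead of
/// rewinding to offset 0.
fn with_restored_position<R, T, F>(reader: &mut R, f: F) -> Result<T>
where
    R: Read + Seek,
    F: FnOnce(&mut R) -> Result<T>,
{
    // Remember where the caller left the reader.
    let original = reader.stream_position()?;
    let out = f(reader);
    // Restore even if `f` failed, so the caller sees an unchanged position.
    reader.seek(SeekFrom::Start(original))?;
    out
}

/// Stand-in for schema inference: consumes the rest of the input and
/// counts the lines it saw.
fn count_lines<R: Read>(reader: &mut R) -> Result<usize> {
    let mut buf = String::new();
    reader.read_to_string(&mut buf)?;
    Ok(buf.lines().count())
}

/// Usage: the caller has already skipped a 4-byte header line before
/// calling the helper. Returns (lines counted, position afterwards).
fn demo() -> Result<(usize, u64)> {
    let mut cursor = Cursor::new(b"a,b\n1,2\n3,4\n".to_vec());
    cursor.seek(SeekFrom::Start(4))?;
    let lines = with_restored_position(&mut cursor, |r| count_lines(r))?;
    Ok((lines, cursor.stream_position()?))
}
```

With this approach a caller that had already advanced past some bytes gets the reader back exactly where it left it, whereas `seek(SeekFrom::Start(0))` would silently discard that offset.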





