lukecwik commented on code in PR #24630:
URL: https://github.com/apache/beam/pull/24630#discussion_r1089546389
##########
sdks/java/io/csv/src/main/java/org/apache/beam/sdk/io/csv/CsvIO.java:
##########
@@ -280,55 +281,53 @@ public Write<T> withCompression(Compression compression) {
return toBuilder().setTextIOWrite(getTextIOWrite().withCompression(compression)).build();
}
+ /** Specifies all data written without spilling, simplifying the pipeline. */
Review Comment:
Note that the parent javadoc states:
```
Whether to skip the spilling of data caused by having maxNumWritersPerBundle.
```
Maybe we could fix both that javadoc and this one so they describe in greater
detail what this parameter actually does.
##########
sdks/java/io/csv/src/main/java/org/apache/beam/sdk/io/csv/CsvIO.java:
##########
@@ -415,28 +413,42 @@ public WriteFilesResult<String> expand(PCollection<T> input) {
PCollection<String> csv = rows.apply("To CSV", MapElements.into(strings()).via(toCsvFn));
- return csv.apply("Write CSV", getTextIOWrite().withHeader(header).withOutputFilenames());
+ return csv.apply("Write CSV", write.withOutputFilenames());
}
- CSVFormat applyRequiredCSVFormatSettings(Schema schema) {
- CSVFormat csvFormat = getCSVFormat().withSkipHeaderRecord();
+ private static CSVFormat buildHeaderFromSchemaIfNeeded(CSVFormat csvFormat, Schema schema) {
if (csvFormat.getHeader() == null) {
csvFormat = csvFormat.withHeader(schema.sorted().getFieldNames().toArray(new String[0]));
}
+
return csvFormat;
}
- private static String formatHeader(CSVFormat csvFormat) {
+ private static TextIO.Write writeWithCSVFormatHeaderAndComments(
+ CSVFormat csvFormat, TextIO.Write write) {
+
+ if (csvFormat.getSkipHeaderRecord()) {
+ return write;
+ }
+
String[] header = requireNonNull(csvFormat.getHeader());
List<String> result = new ArrayList<>();
if (csvFormat.getHeaderComments() != null) {
for (String comment : csvFormat.getHeaderComments()) {
result.add(csvFormat.getCommentMarker() + " " + comment);
}
}
+
CSVFormat withoutHeaderComments = csvFormat.withHeaderComments();
Review Comment:
nit: in a follow-up, did you mean to name this `withHeaderComments`?
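
For readers following the hunk above, here is a minimal, stdlib-only sketch of how the header text appears to be assembled: comment lines first, each prefixed with the comment marker, then the header record. It does not use Commons CSV or Beam. The class name `CsvHeaderSketch`, the method `headerText`, and its parameters are hypothetical stand-ins for the `CSVFormat` getters, and joining the header names with a plain delimiter is an assumption, since the hunk is cut off before the header record is formatted.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the code from the PR.
final class CsvHeaderSketch {

  /**
   * Builds the text placed at the top of each file, or returns null when the
   * header record is skipped (mirroring the early return in the hunk).
   */
  static String headerText(
      String[] header,
      String[] headerComments,
      char commentMarker,
      char delimiter,
      boolean skipHeaderRecord) {
    if (skipHeaderRecord) {
      return null;
    }
    if (header == null) {
      throw new NullPointerException("header must not be null");
    }
    List<String> lines = new ArrayList<>();
    if (headerComments != null) {
      // Same shape as the loop in the hunk: marker, space, comment.
      for (String comment : headerComments) {
        lines.add(commentMarker + " " + comment);
      }
    }
    // Assumption: no quoting or escaping, unlike a real CSVFormat.
    lines.add(String.join(String.valueOf(delimiter), header));
    return String.join("\n", lines);
  }

  public static void main(String[] args) {
    System.out.println(
        headerText(
            new String[] {"id", "name"}, new String[] {"generated file"}, '#', ',', false));
  }
}
```

In this sketch the comments are emitted by hand, which is presumably why the hunk then derives a format with the header comments cleared before formatting the header record itself.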