I'm writing a schema-aware IO abstraction in Beam Core, SchemaIOProvider
<https://github.com/apache/beam/blob/9b0941945545e71a949649309e05e405ca73aea2/sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/io/SchemaIOProvider.java>,
and implementing it by moving the logic of the IO table providers and tables
from Beam SQL into the module of each IO (e.g. PubsubSchemaCapableIOProvider
<https://github.com/apache/beam/blob/9b0941945545e71a949649309e05e405ca73aea2/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/pubsub/PubsubSchemaCapableIOProvider.java>).
This gives that logic a single home to build on (design doc
<https://docs.google.com/document/d/1ic3P8EVGHIydHQ-VMDKbN9kEdwm7sBXMo80VrhwksvI/edit>
).

I'm trying to implement a TextSchemaCapableIOProvider in Beam Core (like
PubsubSchemaCapableIOProvider
<https://github.com/apache/beam/blob/9b0941945545e71a949649309e05e405ca73aea2/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/pubsub/PubsubSchemaCapableIOProvider.java>),
where TextIO is, but the table logic
<https://github.com/apache/beam/blob/9b0941945545e71a949649309e05e405ca73aea2/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/meta/provider/text/TextTableProvider.java#L102-L126>
relies on org.apache.commons.csv.CSVFormat. What are your thoughts on
adding a dependency on Apache Commons CSV (org.apache.commons.csv) to Beam
Core? Or should I try to find a workaround?
