[
https://issues.apache.org/jira/browse/BEAM-11091?focusedWorklogId=506215&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-506215
]
ASF GitHub Bot logged work on BEAM-11091:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 29/Oct/20 13:29
Start Date: 29/Oct/20 13:29
Worklog Time Spent: 10m
Work Description: JozoVilcek commented on a change in pull request #13166:
URL: https://github.com/apache/beam/pull/13166#discussion_r514257586
##########
File path: sdks/java/io/hadoop-format/src/main/java/org/apache/beam/sdk/io/hadoop/format/HadoopFormatIO.java
##########
@@ -437,6 +445,12 @@
.build();
}
+ /** Transforms the keys read from the source using the given key translation function. */
+ public Read<K, V> withKeyTranslation(SimpleFunction<?, K> function, Coder<K> coder) {
Review comment:
I tried passing in a `SerializableFunction`, but the inner code of
HadoopFormat requires the function's input and output types for
validation, e.g. to check that the input format's value class matches the
translation function's input class. I chose to stick with `SimpleFunction`
and simply allow a specific coder to be passed in.
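To illustrate why a function type that carries its own type information
helps with this kind of validation, here is a minimal, self-contained sketch
in plain Java. It does not use Beam: `TypedFunction` and
`KeyTranslationCheck` are hypothetical stand-ins, and the reflection trick
shown only works for direct subclasses. It is meant as an analogy to the
check described above, not as the actual HadoopFormatIO code.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.function.Function;

// Hypothetical validation helper; not part of Beam.
class KeyTranslationCheck {
  // Rejects a translation function whose declared input type differs from
  // the class produced by the input format.
  static void validate(Class<?> formatClass, TypedFunction<?, ?> fn) {
    if (!formatClass.equals(fn.inputType())) {
      throw new IllegalArgumentException(
          "Input format produces " + formatClass.getName()
              + " but the translation function expects " + fn.inputType());
    }
  }

  public static void main(String[] args) {
    // The anonymous subclass records <Long, String> in its class metadata.
    TypedFunction<Long, String> toText =
        new TypedFunction<Long, String>() {
          @Override
          public String apply(Long key) {
            return "key-" + key;
          }
        };
    validate(Long.class, toText); // types match, so this passes
    System.out.println(toText.inputType() + " -> " + toText.outputType());
  }
}

// Hypothetical stand-in for a function that remembers its type arguments.
// A plain lambda implementing Function<I, O> offers no comparable way to
// recover I and O at runtime.
abstract class TypedFunction<I, O> implements Function<I, O> {
  private final Type[] typeArgs;

  protected TypedFunction() {
    // Assumption: the concrete class directly extends TypedFunction<I, O>
    // with concrete type arguments.
    Type superType = getClass().getGenericSuperclass();
    if (!(superType instanceof ParameterizedType)) {
      throw new IllegalStateException("Subclass must supply type arguments");
    }
    typeArgs = ((ParameterizedType) superType).getActualTypeArguments();
  }

  Type inputType() {
    return typeArgs[0];
  }

  Type outputType() {
    return typeArgs[1];
  }
}
```

The output type can be recovered the same way, but when the output is a
type with no usable default coder, knowing the type alone is not enough,
which is where an explicitly supplied coder comes in.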
Issue Time Tracking
-------------------
Worklog Id: (was: 506215)
Time Spent: 1h (was: 50m)
> HadoopFormatIO should allow to specify coders
> ---------------------------------------------
>
> Key: BEAM-11091
> URL: https://issues.apache.org/jira/browse/BEAM-11091
> Project: Beam
> Issue Type: Improvement
> Components: io-java-hadoop-format
> Affects Versions: 2.24.0
> Reporter: Jozef Vilcek
> Assignee: Jozef Vilcek
> Priority: P3
> Time Spent: 1h
> Remaining Estimate: 0h
>
> HadoopFormatIO only allows passing in type descriptors for the key and
> value, which are then resolved to a Coder via the CoderRegistry.
> We found this restrictive after
> https://issues.apache.org/jira/browse/BEAM-9569
> which makes it impossible to emit Row from the source.
> An easy solution would be to allow setting the Coder directly and, in that
> case, bypass the resolution against the registry.
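
The proposal in the description can be sketched as "use the explicit coder
if one was set, otherwise fall back to the registry". The snippet below is a
simplified illustration in plain Java under that assumption. `SketchRead`,
`SketchRegistry`, and `SketchCoder` are hypothetical names, not Beam classes,
and the real change in the pull request may be structured differently.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical reader configuration; stands in for a Read transform.
class SketchRead<K> {
  private final Class<K> keyClass;
  private final SketchCoder<K> explicitKeyCoder; // null means "infer"

  private SketchRead(Class<K> keyClass, SketchCoder<K> explicitKeyCoder) {
    this.keyClass = keyClass;
    this.explicitKeyCoder = explicitKeyCoder;
  }

  static <K> SketchRead<K> forKeyClass(Class<K> keyClass) {
    return new SketchRead<>(keyClass, null);
  }

  // Returns a copy configured with a coder that bypasses the registry.
  SketchRead<K> withKeyCoder(SketchCoder<K> coder) {
    return new SketchRead<>(keyClass, coder);
  }

  // An explicit coder wins; the registry is consulted only as a fallback.
  SketchCoder<K> resolveKeyCoder(SketchRegistry registry) {
    return explicitKeyCoder != null ? explicitKeyCoder : registry.lookup(keyClass);
  }

  public static void main(String[] args) {
    SketchRegistry registry = new SketchRegistry();
    registry.register(String.class, () -> "registry-string");

    SketchRead<String> inferred = SketchRead.forKeyClass(String.class);
    SketchRead<String> explicit = inferred.withKeyCoder(() -> "explicit-string");

    System.out.println(inferred.resolveKeyCoder(registry).name());
    System.out.println(explicit.resolveKeyCoder(registry).name());
  }
}

// Hypothetical minimal coder abstraction; a real coder also encodes/decodes.
interface SketchCoder<T> {
  String name();
}

// Hypothetical registry mapping a class to its default coder.
class SketchRegistry {
  private final Map<Class<?>, SketchCoder<?>> coders = new HashMap<>();

  <T> void register(Class<T> type, SketchCoder<T> coder) {
    coders.put(type, coder);
  }

  @SuppressWarnings("unchecked")
  <T> SketchCoder<T> lookup(Class<T> type) {
    SketchCoder<?> coder = coders.get(type);
    if (coder == null) {
      throw new IllegalStateException("No coder registered for " + type.getName());
    }
    return (SketchCoder<T>) coder;
  }
}
```

With this shape, a type that the registry cannot resolve still works as long
as the caller supplies a coder, while existing callers that rely on
inference are unaffected.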