[jira] [Updated] (BEAM-5180) Broken FileResultCoder via parseSchema change
[ https://issues.apache.org/jira/browse/BEAM-5180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jozef Vilcek updated BEAM-5180:
---
Description:
Recently this commit [https://github.com/apache/beam/commit/3fff58c21f94415f3397e185377e36d3df662384] introduced stricter scheme parsing, which breaks the contract between _FileResultCoder_ and _FileSystems.matchNewResource()_.

The coder takes a _ResourceId_, serializes it via its _toString()_ method, and then relies on the filesystem being able to parse it back again. Requiring a strict _scheme://_ breaks this at least for the Hadoop filesystem, which uses a _URI_ for its _ResourceId_ and produces a _toString()_ of the form `_hdfs:/some/path_`.

I guess the _ResourceIdCoder_ suffers from the same problem.

was:
Recently this commit https://github.com/apache/beam/commit/3fff58c21f94415f3397e185377e36d3df662384 introduced stricter scheme parsing, which breaks the contract between _FileResultCoder_ and _FileSystems.matchNewResource()_.

The coder takes a _ResourceId_, serializes it via its _toString()_ method, and then relies on the filesystem being able to parse it back again. Requiring a strict _scheme://_ breaks this at least for the Hadoop filesystem, which uses a _URI_ for its _ResourceId_ and produces a _toString()_ of the form `_hdfs:/some/path_`.

I guess the _ResourceIdCoder_ suffers from the same problem.

Either scheme parsing should be less strict, or ResourceId.toString() for Hadoop should be fixed.

> Broken FileResultCoder via parseSchema change
> ---------------------------------------------
>
> Key: BEAM-5180
> URL: https://issues.apache.org/jira/browse/BEAM-5180
> Project: Beam
> Issue Type: Bug
> Components: sdk-java-core
> Affects Versions: 2.6.0
> Reporter: Jozef Vilcek
> Assignee: Kenneth Knowles
> Priority: Blocker
>
> Recently this commit
> [https://github.com/apache/beam/commit/3fff58c21f94415f3397e185377e36d3df662384]
> introduced stricter scheme parsing, which breaks the contract between
> _FileResultCoder_ and _FileSystems.matchNewResource()_.
> The coder takes a _ResourceId_, serializes it via its _toString()_ method, and
> then relies on the filesystem being able to parse it back again. Requiring a
> strict _scheme://_ breaks this at least for the Hadoop filesystem, which uses a
> _URI_ for its _ResourceId_ and produces a _toString()_ of the form
> `_hdfs:/some/path_`.
> I guess the _ResourceIdCoder_ suffers from the same problem.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Updated] (BEAM-5180) Broken FileResultCoder via parseSchema change
[ https://issues.apache.org/jira/browse/BEAM-5180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jozef Vilcek updated BEAM-5180:
---
Description:
Recently this commit https://github.com/apache/beam/commit/3fff58c21f94415f3397e185377e36d3df662384 introduced stricter scheme parsing, which breaks the contract between _FileResultCoder_ and _FileSystems.matchNewResource()_.

The coder takes a _ResourceId_, serializes it via its _toString()_ method, and then relies on the filesystem being able to parse it back again. Requiring a strict _scheme://_ breaks this at least for the Hadoop filesystem, which uses a _URI_ for its _ResourceId_ and produces a _toString()_ of the form `_hdfs:/some/path_`.

I guess the _ResourceIdCoder_ suffers from the same problem.

Either scheme parsing should be less strict, or ResourceId.toString() for Hadoop should be fixed.

was:
Recently this commit introduced stricter scheme parsing, which breaks the contract between `FileResultCoder` and `FileSystems.matchNewResource()`.

The coder takes a `ResourceId`, serializes it via its `toString()` method, and then relies on the filesystem being able to parse it back again. Requiring a strict `scheme://` breaks this at least for the `Hadoop` filesystem, which uses a `URI` for its `ResourceId` and produces a `toString()` of the form `hdfs:/some/path`.

I guess the `ResourceIdCoder` suffers from the same problem.

Either scheme parsing should be less strict, or `ResourceId.toString()` for `hadoop` should be fixed.

> Broken FileResultCoder via parseSchema change
> ---------------------------------------------
>
> Key: BEAM-5180
> URL: https://issues.apache.org/jira/browse/BEAM-5180
> Project: Beam
> Issue Type: Bug
> Components: sdk-java-core
> Affects Versions: 2.6.0
> Reporter: Jozef Vilcek
> Assignee: Kenneth Knowles
> Priority: Blocker
>
> Recently this commit
> https://github.com/apache/beam/commit/3fff58c21f94415f3397e185377e36d3df662384
> introduced stricter scheme parsing, which breaks the contract between
> _FileResultCoder_ and _FileSystems.matchNewResource()_.
> The coder takes a _ResourceId_, serializes it via its _toString()_ method, and
> then relies on the filesystem being able to parse it back again. Requiring a
> strict _scheme://_ breaks this at least for the Hadoop filesystem, which uses a
> _URI_ for its _ResourceId_ and produces a _toString()_ of the form
> `_hdfs:/some/path_`.
> I guess the _ResourceIdCoder_ suffers from the same problem.
> Either scheme parsing should be less strict, or ResourceId.toString() for
> Hadoop should be fixed.
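
For readers who want to reproduce the mismatch outside of Beam, the following is a minimal sketch using only the JDK. It is not Beam's code: the class name, the two regular expressions, and the fallback to "file" are assumptions made for illustration. It shows that a java.net.URI built from a scheme and a path with no authority prints as hdfs:/some/path, that a pattern requiring "://" no longer recognizes the scheme in that string, and that a pattern requiring only ":/" does.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SchemeParsingSketch {
    // Hypothetical strict pattern: the scheme must be followed by "://".
    static final Pattern STRICT =
        Pattern.compile("(?<scheme>[a-zA-Z][-a-zA-Z0-9+.]*)://.*");

    // Hypothetical lenient pattern: the scheme only needs to be followed by ":/".
    static final Pattern LENIENT =
        Pattern.compile("(?<scheme>[a-zA-Z][-a-zA-Z0-9+.]*):/.*");

    // Returns the scheme matched by the pattern, or "file" as an assumed default.
    static String parseScheme(Pattern pattern, String spec) {
        Matcher m = pattern.matcher(spec);
        return m.matches() ? m.group("scheme") : "file";
    }

    // Stands in for the "encode" half of the round trip: a URI with a scheme
    // and a path but no authority, rendered with toString().
    static String encode(String scheme, String path) {
        try {
            return new URI(scheme, null, path, null, null).toString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String encoded = encode("hdfs", "/some/path");
        System.out.println(encoded);                       // hdfs:/some/path
        System.out.println(parseScheme(STRICT, encoded));  // file
        System.out.println(parseScheme(LENIENT, encoded)); // hdfs
    }
}
```

The lenient pattern corresponds to the "less strict" option in the description. One trade-off to keep in mind with this particular sketch is that a string such as C:/tmp/file would also match the lenient pattern and be read as scheme "C", so a real fix would need to decide how to treat such paths.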