[ https://issues.apache.org/jira/browse/CRUNCH-668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16418653#comment-16418653 ]
Clément MATHIEU commented on CRUNCH-668:
----------------------------------------
Patch updated.
It restores the ability to pass a single file as the path. The new logic mimics
what {{SourceTargetHelper#getPathSize}} does. My understanding is that this is
the behaviour Crunch aims to support, but a careful review is welcome as it
seems easy to get wrong.
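To make the intended behaviour easier to review, here is a minimal sketch of
the "file, directory, or glob" resolution described above. It is not the code
from the patch: {{java.nio.file}} stands in for Hadoop's {{FileSystem}}, and
the class and method names ({{FirstFileSketch}}, {{glob}}, {{firstFile}}) are
hypothetical. The idea is to always expand the argument as a glob first (a
plain name simply matches itself), then descend one level into any directory
that matched.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Optional;

// Hypothetical illustration only: java.nio.file is used as a stand-in for
// Hadoop's FileSystem, so none of this is Crunch or Hadoop code.
final class FirstFileSketch {

    // Expands `pattern` against the entries of `parent`, sorted by path.
    // A pattern without glob characters matches only the entry of that name.
    public static List<Path> glob(Path parent, String pattern) {
        List<Path> matches = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(parent, pattern)) {
            for (Path p : stream) {
                matches.add(p);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Collections.sort(matches);
        return matches;
    }

    // Returns the first regular file designated by `pattern`, which may name
    // a single file, a directory, or a glob such as "part-*.avro".
    public static Optional<Path> firstFile(Path parent, String pattern) {
        for (Path match : glob(parent, pattern)) {
            if (Files.isRegularFile(match)) {
                return Optional.of(match);          // file, or glob hitting a file
            }
            if (Files.isDirectory(match)) {
                for (Path child : glob(match, "*")) { // directory: look one level down
                    if (Files.isRegularFile(child)) {
                        return Optional.of(child);
                    }
                }
            }
        }
        return Optional.empty();                     // nothing matched
    }
}
```

With this shape a single code path covers all three kinds of input, which is
the property that seems easy to lose when files and directories are special
cased separately.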
I also spotted a few other places where globs are not supported. For example,
passing a glob to a Source and materializing the resulting PCollection fails,
whereas adding an intermediate identity DoFn makes it work. Unfortunately, I
don't have time to fix these as they are not on my critical path.
> From.avroFile do not support globbing patterns (GenericData based overloads)
> ----------------------------------------------------------------------------
>
> Key: CRUNCH-668
> URL: https://issues.apache.org/jira/browse/CRUNCH-668
> Project: Crunch
> Issue Type: Improvement
> Components: Core
> Affects Versions: 0.15.0
> Reporter: Clément MATHIEU
> Assignee: Josh Wills
> Priority: Major
> Attachments:
> 0001-CRUNCH-668-Support-globbing-patterns-in-From-avroFil-v2.patch,
> 0001-CRUNCH-668-Support-globbing-patterns-in-From-avroFil.patch
>
>
> GenericData based overloads of {{From.avroFile}} throw a RuntimeException
> when a globbing pattern is provided. I see no reason not to support globbing
> patterns here, as they work fine with {{textFile}} and the SpecificData based
> overloads.
> The issue is that the code extracting the Avro schema from the first file uses
> {{listStatus}} rather than {{globStatus}}.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)