Re: reading from s3 file in aws
Filed BEAM-2500 as a feature request.

On Thu, Jun 22, 2017 at 9:00 AM, tarush grover wrote:

> Hi All,
>
> Can we add a module s3-file-system in beam to directly support and have
> integration with s3?
>
> Regards,
> Tarush
>
> On Thu, 22 Jun 2017 at 9:21 PM, Lukasz Cwik wrote:
>
> > You want to depend on the Hadoop File System module[1] and configure
> > HadoopFileSystemOptions[2] with an S3 configuration[3].
> >
> > 1: https://github.com/apache/beam/tree/master/sdks/java/io/hadoop-file-system
> > 2: https://github.com/apache/beam/blob/master/sdks/java/io/hadoop-file-system/src/main/java/org/apache/beam/sdk/io/hdfs/HadoopFileSystemOptions.java#L53
> > 3: https://wiki.apache.org/hadoop/AmazonS3
> >
> > On Wed, Jun 21, 2017 at 10:25 PM, Jyotirmoy Sundi wrote:
> >
> > > Hi Folks,
> > >
> > > Is there any way to read from S3 buckets in Beam?
> > >
> > > Trace:
> > > Exception in thread "main" java.lang.IllegalStateException: Failed to validate s3://my_bucket/path/to/input-*.csv
> > >     at org.apache.beam.sdk.io.TextIO$Read$Bound.expand(TextIO.java:254)
> > >     at org.apache.beam.sdk.io.TextIO$Read$Bound.expand(TextIO.java:165)
> > >     at org.apache.beam.sdk.Pipeline.applyInternal(Pipeline.java:420)
> > >     at org.apache.beam.sdk.Pipeline.applyTransform(Pipeline.java:334)
> > >     at org.apache.beam.sdk.values.PBegin.apply(PBegin.java:47)
> > >     at org.apache.beam.sdk.Pipeline.apply(Pipeline.java:157)
> > >     at com.intuit.pml.sessions.Eyeball.main(Eyeball.java:77)
> > >     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > >     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> > >     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > >     at java.lang.reflect.Method.invoke(Method.java:498)
> > >     at com.intellij.rt.execution.application.AppMain.main(AppMain.java:144)
> > > Caused by: java.io.IOException: Unable to find handler for s3://my_bucket/path/to/input-*.csv
> > >     at org.apache.beam.sdk.util.IOChannelUtils.getFactory(IOChannelUtils.java:307)
> > >     at org.apache.beam.sdk.io.TextIO$Read$Bound.expand(TextIO.java:249)
> > >     ... 11 more
> > > --
> > > Best Regards,

Re: reading from s3 file in aws
You want to depend on the Hadoop File System module[1] and configure
HadoopFileSystemOptions[2] with an S3 configuration[3].

1: https://github.com/apache/beam/tree/master/sdks/java/io/hadoop-file-system
2: https://github.com/apache/beam/blob/master/sdks/java/io/hadoop-file-system/src/main/java/org/apache/beam/sdk/io/hdfs/HadoopFileSystemOptions.java#L53
3: https://wiki.apache.org/hadoop/AmazonS3

On Wed, Jun 21, 2017 at 10:25 PM, Jyotirmoy Sundi wrote:

> Hi Folks,
>
> Is there any way to read from S3 buckets in Beam?
>
> Trace:
> Exception in thread "main" java.lang.IllegalStateException: Failed to validate s3://my_bucket/path/to/input-*.csv
>     at org.apache.beam.sdk.io.TextIO$Read$Bound.expand(TextIO.java:254)
>     at org.apache.beam.sdk.io.TextIO$Read$Bound.expand(TextIO.java:165)
>     at org.apache.beam.sdk.Pipeline.applyInternal(Pipeline.java:420)
>     at org.apache.beam.sdk.Pipeline.applyTransform(Pipeline.java:334)
>     at org.apache.beam.sdk.values.PBegin.apply(PBegin.java:47)
>     at org.apache.beam.sdk.Pipeline.apply(Pipeline.java:157)
>     at com.intuit.pml.sessions.Eyeball.main(Eyeball.java:77)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at com.intellij.rt.execution.application.AppMain.main(AppMain.java:144)
> Caused by: java.io.IOException: Unable to find handler for s3://my_bucket/path/to/input-*.csv
>     at org.apache.beam.sdk.util.IOChannelUtils.getFactory(IOChannelUtils.java:307)
>     at org.apache.beam.sdk.io.TextIO$Read$Bound.expand(TextIO.java:249)
>     ... 11 more
> --
> Best Regards,

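To make the suggestion above a little more concrete, here is a rough sketch of how the S3 settings might be handed to HadoopFileSystemOptions from the command line. It assumes the option is exposed as --hdfsConfiguration and takes a JSON list of property maps, one map per Hadoop filesystem. The helper class HdfsConfigurationFlag is hypothetical (it is not part of Beam), the fs.s3a.* keys are Hadoop's S3A connector properties, and the use of fs.defaultFS with an s3a:// URI is an assumption that should be checked against the module linked above.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper (not part of Beam): renders Hadoop properties as the
// JSON list-of-maps value that --hdfsConfiguration is assumed to accept.
final class HdfsConfigurationFlag {

  // Escapes backslashes and double quotes only; enough for typical property
  // names and values, not a general-purpose JSON encoder.
  static String escape(String s) {
    return s.replace("\\", "\\\\").replace("\"", "\\\"");
  }

  // Builds --hdfsConfiguration=[{"key":"value",...}] for one Hadoop filesystem.
  static String build(Map<String, String> properties) {
    StringJoiner entries = new StringJoiner(",", "{", "}");
    for (Map.Entry<String, String> e : properties.entrySet()) {
      entries.add("\"" + escape(e.getKey()) + "\":\"" + escape(e.getValue()) + "\"");
    }
    return "--hdfsConfiguration=[" + entries + "]";
  }

  public static void main(String[] args) {
    Map<String, String> s3 = new LinkedHashMap<>();
    // Assumption: the default filesystem URI decides which scheme gets registered.
    s3.put("fs.defaultFS", "s3a://my_bucket");
    // Hadoop S3A credential properties; values are read from the environment,
    // with placeholder defaults so the sketch runs without credentials.
    s3.put("fs.s3a.access.key",
        System.getenv().getOrDefault("AWS_ACCESS_KEY_ID", "ACCESS_KEY"));
    s3.put("fs.s3a.secret.key",
        System.getenv().getOrDefault("AWS_SECRET_ACCESS_KEY", "SECRET_KEY"));
    System.out.println(build(s3));
  }
}
```

The printed flag would then be passed as a program argument to the pipeline, where PipelineOptionsFactory.fromArgs(args).as(HadoopFileSystemOptions.class) should pick it up. This also assumes the hadoop-aws artifact is on the classpath and that the input path uses the s3a:// scheme rather than s3://.
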
reading from s3 file in aws
Hi Folks,

Is there any way to read from S3 buckets in Beam?

Trace:
Exception in thread "main" java.lang.IllegalStateException: Failed to validate s3://my_bucket/path/to/input-*.csv
    at org.apache.beam.sdk.io.TextIO$Read$Bound.expand(TextIO.java:254)
    at org.apache.beam.sdk.io.TextIO$Read$Bound.expand(TextIO.java:165)
    at org.apache.beam.sdk.Pipeline.applyInternal(Pipeline.java:420)
    at org.apache.beam.sdk.Pipeline.applyTransform(Pipeline.java:334)
    at org.apache.beam.sdk.values.PBegin.apply(PBegin.java:47)
    at org.apache.beam.sdk.Pipeline.apply(Pipeline.java:157)
    at com.intuit.pml.sessions.Eyeball.main(Eyeball.java:77)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at com.intellij.rt.execution.application.AppMain.main(AppMain.java:144)
Caused by: java.io.IOException: Unable to find handler for s3://my_bucket/path/to/input-*.csv
    at org.apache.beam.sdk.util.IOChannelUtils.getFactory(IOChannelUtils.java:307)
    at org.apache.beam.sdk.io.TextIO$Read$Bound.expand(TextIO.java:249)
    ... 11 more

--
Best Regards,