Re: Please assist; how do I use a Sample transform?
Many thanks for your help.

On Wed, Dec 18, 2019, 12:45 AM Kyle Weaver wrote:
> We could make the Sample class uninstantiable to give a slightly more
> specific error here. Not sure how much that would help though.
Re: Please assist; how do I use a Sample transform?
We could make the Sample class uninstantiable to give a slightly more specific error here. Not sure how much that would help though.

On Tue, Dec 17, 2019 at 4:40 PM Robert Bradshaw wrote:
> beam.transforms.combiners.Sample is a container class that hails back
> to the days when folks more familiar with Java were just copying
> things over, and is just an empty class containing actual transforms
> (as Kyle indicates).
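To make the "uninstantiable" idea concrete, here is a minimal sketch in plain Python. It is not Beam's source: the Sample class below is a hypothetical stand-in, the nested FixedSizeGlobally is only a placeholder that stores its argument, and the wording of the error message is made up for illustration.

```python
class Sample(object):
    """Hypothetical stand-in for a container class that only groups transforms."""

    def __new__(cls, *args, **kwargs):
        # Refuse instantiation and point the caller at the nested transforms.
        raise TypeError(
            'Sample is only a container; use Sample.FixedSizeGlobally(n) '
            'or Sample.FixedSizePerKey(n) instead.')

    class FixedSizeGlobally(object):
        """Placeholder for a real transform; it only stores the sample size."""

        def __init__(self, n):
            self.n = n


def try_instantiate():
    """Returns the error message raised by calling Sample(), or None."""
    try:
        Sample()
    except TypeError as err:
        return str(err)
    return None


if __name__ == '__main__':
    print(try_instantiate())
```

With something along these lines, the failure would surface at the Sample() call itself rather than at the >> operator. The nested class is unaffected because it does not inherit from Sample.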
Re: Please assist; how do I use a Sample transform?
beam.transforms.combiners.Sample is a container class that hails back to the days when folks more familiar with Java were just copying things over, and is just an empty class containing actual transforms (as Kyle indicates). These are shorthand for beam.CombineGlobally(beam.transforms.combiners.SampleCombineFn(...)), beam.CombinePerKey(beam.transforms.combiners.SampleCombineFn(...)), etc.

On Tue, Dec 17, 2019 at 2:10 PM Kyle Weaver wrote:
> Probably FixedSizeGlobally in your case. For example,
>
> beam.transforms.combiners.Sample.FixedSizeGlobally(5)
Re: Please assist; how do I use a Sample transform?
Looks like you need to choose one of the transforms nested in Sample. Probably FixedSizeGlobally in your case. For example:

    beam.transforms.combiners.Sample.FixedSizeGlobally(5)

Source: https://github.com/apache/beam/blob/df376164fee1a8f54f3ad00c45190b813ffbdd34/sdks/python/apache_beam/transforms/combiners.py#L619

On Tue, Dec 17, 2019 at 2:01 PM Marco Mistroni wrote:
> | 'sampling lines' >> beam.transforms.combiners.Sample()
>
> TypeError: unsupported operand type(s) for >>: 'str' and 'Sample'
Please assist; how do I use a Sample transform?
Hi all,
Beam noob here.

I have written a Beam app that processes the contents of a file. For debugging purposes, I wanted to get a sample of the lines in the file using the Sample combiner, but I cannot find any examples in Python.
Here's my rough code:

    ...
    | 'Filter only row longer than 100 chars' >> beam.Filter(lambda row: len(row) > 100)
    | 'sampling lines' >> beam.transforms.combiners.Sample()

but the code above gives me

    TypeError: unsupported operand type(s) for >>: 'str' and 'Sample'

Could anyone help?

Kind regards,
Marco