[ 
https://issues.apache.org/jira/browse/BEAM-5627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16792477#comment-16792477
 ] 

Juta Staes edited comment on BEAM-5627 at 3/14/19 9:21 AM:
-----------------------------------------------------------

I investigated why this test leaves warnings:

This test runs two threads: one reads data while the other tries to split the 
data, which is only possible if the first thread has not finished executing. 
The test leaves no warnings if both outcomes are observed when running the 
threads 100 times (the split succeeds at least once and fails at least once). 
In Python 3, threads are implemented differently, so the threads always execute 
in the same order, even when run 1000 times, because each thread's duration is 
too short for it to be interrupted. If I increase the duration of both threads 
by adding `time.sleep(0.003)`, the test does not leave any warnings.

Test code regarding threads: 
[https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/source_test_utils.py#L651]

Conclusion: this test needs to be redesigned for Python 3. Do you have any 
suggestions on how to handle this?
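
One possible direction, offered only as a rough sketch: instead of relying on 
the scheduler to produce both orderings by chance, the test could force each 
ordering with `threading.Event`. The code below is not Beam code. `ToyReader` 
and `run_once` are hypothetical names used for illustration, and the toy reader 
only mimics the idea that a split succeeds only while reading is unfinished.

```python
import threading


class ToyReader(object):
    """Hypothetical stand-in for a source reader (not a Beam class)."""

    def __init__(self, size):
        self._lock = threading.Lock()
        self._pos = 0
        self._stop = size

    def read_all(self, pause_after_first=None):
        records = []
        while True:
            with self._lock:
                if self._pos >= self._stop:
                    return records
                records.append(self._pos)
                self._pos += 1
            # Optional hook that lets a test interleave a split mid-read.
            if pause_after_first is not None and len(records) == 1:
                pause_after_first()

    def try_split(self, at):
        # A split is only possible while unread records remain past `at`.
        with self._lock:
            if self._pos < at < self._stop:
                self._stop = at
                return True
            return False


def run_once(split_mid_read):
    """Runs reader and splitter threads in a forced order."""
    reader = ToyReader(10)
    first_read = threading.Event()
    split_done = threading.Event()
    read_done = threading.Event()
    outcome = {}

    def pause():
        first_read.set()
        split_done.wait()

    def read():
        hook = pause if split_mid_read else None
        outcome['records'] = reader.read_all(hook)
        read_done.set()

    def split():
        # Wait either for the first record or for the whole read to finish.
        (first_read if split_mid_read else read_done).wait()
        outcome['split'] = reader.try_split(5)
        split_done.set()

    threads = [threading.Thread(target=read), threading.Thread(target=split)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return outcome
```

Calling `run_once(True)` exercises the successful split and `run_once(False)` 
the failed one, so both outcomes are covered in two runs regardless of how the 
interpreter schedules threads. Whether the real test utility can expose a 
similar hook is an open question.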

cc: [~chamikara], [~tvalentyn]


> Investigate why test_split_at_fraction_exhaustive consistently fails to split 
> after 101 attempts on Python 3
> ------------------------------------------------------------------------------------------------------------
>
>                 Key: BEAM-5627
>                 URL: https://issues.apache.org/jira/browse/BEAM-5627
>             Project: Beam
>          Issue Type: Sub-task
>          Components: sdk-py-core
>            Reporter: Valentyn Tymofieiev
>            Assignee: Juta Staes
>            Priority: Minor
>              Labels: triaged
>             Fix For: Not applicable
>
>          Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> ERROR: test_split_at_fraction_exhaustive (apache_beam.io.source_test_utils_test.SourceTestUtilsTest)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/io/source_test_utils_test.py", line 120, in test_split_at_fraction_exhaustive
>     source = self._create_source(data)
>   File "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/io/source_test_utils_test.py", line 43, in _create_source
>     source = LineSource(self._create_file_with_data(data))
>   File "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/io/source_test_utils_test.py", line 35, in _create_file_with_data
>     f.write(line + '\n')
>   File "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/target/.tox/py3/lib/python3.5/tempfile.py", line 622, in func_wrapper
>     return func(*args, **kwargs)
> TypeError: a bytes-like object is required, not 'str'
> Also similar:
> ======================================================================
> ERROR: test_file_sink_writing (apache_beam.io.filebasedsink_test.TestFileBasedSink)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/io/filebasedsink_test.py", line 121, in test_file_sink_writing
>     init_token, writer_results = self._common_init(sink)
>   File "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/io/filebasedsink_test.py", line 103, in _common_init
>     writer1 = sink.open_writer(init_token, '1')
>   File "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/options/value_provider.py", line 133, in _f
>     return fnc(self, *args, **kwargs)
>   File "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/io/filebasedsink.py", line 185, in open_writer
>     return FileBasedSinkWriter(self, os.path.join(init_result, uid) + suffix)
>   File "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/io/filebasedsink.py", line 385, in __init__
>     self.temp_handle = self.sink.open(temp_shard_path)
>   File "/usr/local/google/home/valentyn/projects/beam/clean_head/beam/sdks/python/apache_beam/io/filebasedsink_test.py", line 82, in open
>     file_handle.write('[start]')
> TypeError: a bytes-like object is required, not 'str'



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
