[ 
https://issues.apache.org/jira/browse/BEAM-11075?focusedWorklogId=507991&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-507991
 ]

ASF GitHub Bot logged work on BEAM-11075:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 05/Nov/20 07:18
            Start Date: 05/Nov/20 07:18
    Worklog Time Spent: 10m 
      Work Description: tszerszen commented on a change in pull request #13245:
URL: https://github.com/apache/beam/pull/13245#discussion_r517835487



##########
File path: sdks/go/pkg/beam/io/synthetic/source.go
##########
@@ -131,8 +130,19 @@ func (fn *sourceFn) ProcessElement(rt *sdf.LockRTracker, config SourceConfig, em
        for i := rt.GetRestriction().(offsetrange.Restriction).Start; rt.TryClaim(i) == true; i++ {
                key := make([]byte, config.KeySize)
                val := make([]byte, config.ValueSize)
-               if _, err := fn.rng.Read(key); err != nil {
-                       return err
+               generator := sourceFn{}
+               generator.rng = rand.New(rand.NewSource(i))

Review comment:
       @youngoli thank you for your feedback. Since most of my experience is 
with Python, I followed the Python implementation when writing this.
   
   In Python, the generator is re-seeded with each index, so I did the same in 
Go and seed the generator with a NewSource on each iteration. The other SDKs' 
implementations have no salt in the config, and the idea was to stay 
consistent with them so that the SDKs can be compared.
   ```python
     def _gen_kv_pair(self, generator, index):
       generator.seed(index)
       rand = generator.random_sample()
   
       # Determines whether to generate hot key or not.
       if rand < self._hot_key_fraction:
         # Generate hot key.
         # An integer is randomly selected from the range [0, numHotKeys-1]
         # with equal probability.
         generator_hot = Generator(index % self._num_hot_keys)
         bytes_ = generator_hot.bytes(self._key_size), generator.bytes(
           self._value_size)
       else:
         bytes_ = generator.bytes(self.element_size)
         bytes_ = bytes_[:self._key_size], bytes_[self._key_size:]
       return bytes_
   
     def read(self, range_tracker):
       index = range_tracker.start_position()
       generator = Generator()
       while range_tracker.try_claim(index):
         time.sleep(self._sleep_per_input_record_sec)
         yield self._gen_kv_pair(generator, index)
         index += 1
   ```
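   
   For illustration, the same per-index seeding could be sketched in Go roughly 
as below. This is only a minimal sketch, not the code in the PR: `genKV` and 
its parameters are hypothetical names, and it leaves out the hot-key handling. 
It relies only on `math/rand` from the standard library.
   
   ```go
   package main

   import (
   	"bytes"
   	"fmt"
   	"math/rand"
   )

   // genKV is a hypothetical helper (not part of the Beam Go SDK). It builds a
   // fresh generator seeded with the element index, so the same index always
   // produces the same key/value bytes, regardless of which worker or bundle
   // processes it.
   func genKV(index int64, keySize, valueSize int) (key, val []byte) {
   	rng := rand.New(rand.NewSource(index))
   	key = make([]byte, keySize)
   	val = make([]byte, valueSize)
   	// (*rand.Rand).Read always fills the slice and returns a nil error.
   	rng.Read(key)
   	rng.Read(val)
   	return key, val
   }

   func main() {
   	k1, v1 := genKV(7, 8, 16)
   	k2, v2 := genKV(7, 8, 16)
   	// Same index, same seed, therefore identical bytes.
   	fmt.Println(bytes.Equal(k1, k2) && bytes.Equal(v1, v2)) // prints "true"
   }
   ```
   
   The trade-off is that a new source is allocated for every element, but the 
output for a given index no longer depends on how the restriction was split.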
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 507991)
    Time Spent: 4h 10m  (was: 4h)

> Load Tests for Go SDK
> ---------------------
>
>                 Key: BEAM-11075
>                 URL: https://issues.apache.org/jira/browse/BEAM-11075
>             Project: Beam
>          Issue Type: Test
>          Components: sdk-go, testing
>            Reporter: Kamil Wasilewski
>            Assignee: Kamil Wasilewski
>            Priority: P3
>          Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> We have Load Tests for Python and Java SDKs[1], but we are missing the ones 
> for Go SDK.
> Tests to be done:
>  * ParDo
>  * Combine
>  * coGBK
>  * GBK
>  * Side Input
> The tests should run on Dataflow and Flink. The tests should be using 
> synthetic source and be running in batch mode.
> [1]http://metrics.beam.apache.org/dashboards/f/OtXje1iGz/performance-tests-metrics



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
