I implemented and profiled the following:

*BigGrepStrFnd* - Boyer-Moore (grep)

*BigGrepBytes* - Rabin-Karp

*BigGrepStr* - Rabin-Karp

*BigGrepScan* - search with sliding window
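
For readers unfamiliar with Rabin-Karp, here is a rough sketch of the rolling-hash idea. It is not the BigGrepBytes code; the name rabinKarp and the multiplier (the 32-bit FNV prime) are my own choices for illustration, and uint32 wraparound stands in for the modulus.

```go
package main

import (
	"bytes"
	"fmt"
)

// base is an illustrative multiplier (the 32-bit FNV prime), not taken
// from the original implementation.
const base = 16777619

// rabinKarp returns the index of the first occurrence of pat in text,
// or -1 if pat is absent. Hypothetical helper for illustration only.
func rabinKarp(text, pat []byte) int {
	n, m := len(text), len(pat)
	if m == 0 {
		return 0
	}
	if m > n {
		return -1
	}
	// hp: hash of the pattern; ht: hash of the current text window;
	// pow: base^m, used to drop the outgoing byte when sliding.
	var hp, ht, pow uint32 = 0, 0, 1
	for i := 0; i < m; i++ {
		hp = hp*base + uint32(pat[i])
		ht = ht*base + uint32(text[i])
		pow *= base
	}
	for i := 0; ; i++ {
		// Compare bytes only when the hashes agree (guards against collisions).
		if hp == ht && bytes.Equal(text[i:i+m], pat) {
			return i
		}
		if i+m >= n {
			return -1
		}
		// Slide the window one byte: add text[i+m], remove text[i].
		ht = ht*base + uint32(text[i+m]) - pow*uint32(text[i])
	}
}

func main() {
	fmt.Println(rabinKarp([]byte("hello world"), []byte("world"))) // 6
}
```

Each window costs O(1) to hash after the first, so the scan is linear in the line length apart from the occasional byte comparison.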

I also implemented a concurrent version of each.

Tested on 100 files, each containing a single 100MB line. The search target 
is a 20-character string at the end of the line, as the worst-case scenario.

The program reads each file and loads the 100MB line into memory 
for analysis. With 100 files I'm getting consistent results.

Here are the results. I also needed to know the memory requirements.

I created two profiles:

go test -run=NONE -bench=cpuprofile

go test -run=NONE -bench=memprofile

Then I analyzed them with go tool pprof.

Computer: Mac mini M1, 8 cores, 16 GB.

The concurrent versions perform the search in parallel on 8 slices of 
the 100MB line.
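
A minimal sketch of the slicing idea, under my own assumptions: the helper name containsConcurrent is hypothetical, bytes.Contains stands in for whichever search routine is plugged in, and each slice is extended by len(pat)-1 bytes so a match that straddles a boundary is not missed.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// containsConcurrent reports whether pat occurs in data by searching
// `parts` roughly equal slices in parallel. Hypothetical helper, not the
// code that was benchmarked.
func containsConcurrent(data, pat []byte, parts int) bool {
	if len(pat) == 0 {
		return true
	}
	if parts < 1 {
		parts = 1
	}
	chunk := (len(data) + parts - 1) / parts // ceiling division
	found := make([]bool, parts)             // one slot per goroutine, no sharing
	var wg sync.WaitGroup
	for p := 0; p < parts; p++ {
		start := p * chunk
		if start >= len(data) {
			break
		}
		// Overlap into the next slice so boundary-straddling matches are seen.
		end := start + chunk + len(pat) - 1
		if end > len(data) {
			end = len(data)
		}
		wg.Add(1)
		go func(p int, part []byte) {
			defer wg.Done()
			found[p] = bytes.Contains(part, pat)
		}(p, data[start:end])
	}
	wg.Wait()
	for _, f := range found {
		if f {
			return true
		}
	}
	return false
}

func main() {
	data := []byte("aaaaaaXYZWaaaaaa") // "XYZW" straddles the midpoint
	fmt.Println(containsConcurrent(data, []byte("XYZW"), 2))
}
```

The slices share the underlying array, so splitting does not copy the 100MB line.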

Fastest

*3.04s* - 60.47GB - BigGrepBytes_Concurrent, ~2.81 times faster than the 
non-concurrent version.

8.55s - 60.48GB BigGrepBytes

Of the non-concurrent versions, the fastest is BigGrepStrFnd. Note that it 
has bigger memory requirements.

*5.44s* - 72.96GB BigGrepStrFnd - Boyer-Moore (grep), ~1.57 times faster 
than BigGrepBytes

3.82s - 72.93GB BigGrepStrFndConcurrent

----

go test -run=NONE -bench=cpuprofile
goos: darwin
goarch: arm64
pkg: mobiledatabooks.com/biggrep
Benchmark_100MB_End__20charsBigGrepBytes_Concurrent-8     1    7653778209 ns/op
Benchmark_100MB_End__20charsBigGrepBytes_-8               1   12326358083 ns/op
Benchmark_100MB_End__20charsBigGrepScan__-8               1   14934611708 ns/op
Benchmark_100MB_End__20charsBigGrepStr___-8               1   13178685417 ns/op
Benchmark_100MB_End__20charsBigGrepStr___Concurrent-8     1    8302944958 ns/op
Benchmark_100MB_End__20charsBigGrepStrFnd-8               1    8765527416 ns/op
Benchmark_100MB_End__20charsBigGrepStrFndConcurrent-8     1    7361004166 ns/op
Benchmark_100MB_End__20charsBigGrepScan__Concurrent-8     1    8413663834 ns/op
PASS
ok     mobiledatabooks.com/biggrep    81.204s

----

----

Type: cpu
Time: Jun 8, 2022 at 10:15pm (PDT)
Duration: 81.11s, Total samples = 89.36s (110.17%)
Active filters:
   focus=Benchmark
   hide=benchm
   show=Benchmark
Showing nodes accounting for 47.88s, 53.58% of 89.36s total
      flat  flat%   sum%        cum   cum%
    10.37s 11.60% 11.60%     10.37s 11.60%  mobiledatabooks.com/biggrep.Benchmark_100MB_End__20charsBigGrepScan__
     8.99s 10.06% 21.67%      8.99s 10.06%  mobiledatabooks.com/biggrep.Benchmark_100MB_End__20charsBigGrepStr___
     8.55s  9.57% 31.23%      8.55s  9.57%  mobiledatabooks.com/biggrep.Benchmark_100MB_End__20charsBigGrepBytes_
     5.44s  6.09% 37.32%      5.44s  6.09%  mobiledatabooks.com/biggrep.Benchmark_100MB_End__20charsBigGrepStrFnd
     4.02s  4.50% 41.82%      4.02s  4.50%  mobiledatabooks.com/biggrep.Benchmark_100MB_End__20charsBigGrepStr___Concurrent
     3.82s  4.27% 46.09%      3.82s  4.27%  mobiledatabooks.com/biggrep.Benchmark_100MB_End__20charsBigGrepStrFndConcurrent
     3.65s  4.08% 50.18%      3.65s  4.08%  mobiledatabooks.com/biggrep.Benchmark_100MB_End__20charsBigGrepScan__Concurrent
     3.04s  3.40% 53.58%      3.04s  3.40%  mobiledatabooks.com/biggrep.Benchmark_100MB_End__20charsBigGrepBytes_Concurrent


----


----

Type: alloc_space
Time: Jun 8, 2022 at 10:17pm (PDT)
Active filters:
   focus=Benchmark
   hide=benchm
   show=Benchmark
Showing nodes accounting for 533.60GB, 100% of 533.61GB total
      flat  flat%   sum%        cum   cum%
   72.96GB 13.67% 13.67%    72.96GB 13.67%  mobiledatabooks.com/biggrep.Benchmark_100MB_End__20charsBigGrepStrFnd
   72.95GB 13.67% 27.34%    72.95GB 13.67%  mobiledatabooks.com/biggrep.Benchmark_100MB_End__20charsBigGrepStr___Concurrent
   72.94GB 13.67% 41.01%    72.94GB 13.67%  mobiledatabooks.com/biggrep.Benchmark_100MB_End__20charsBigGrepStr___
   72.93GB 13.67% 54.68%    72.93GB 13.67%  mobiledatabooks.com/biggrep.Benchmark_100MB_End__20charsBigGrepStrFndConcurrent
   60.48GB 11.33% 66.01%    60.48GB 11.33%  mobiledatabooks.com/biggrep.Benchmark_100MB_End__20charsBigGrepBytes_
   60.47GB 11.33% 77.35%    60.47GB 11.33%  mobiledatabooks.com/biggrep.Benchmark_100MB_End__20charsBigGrepBytes_Concurrent
   60.45GB 11.33% 88.67%    60.45GB 11.33%  mobiledatabooks.com/biggrep.Benchmark_100MB_End__20charsBigGrepScan__Concurrent
   60.43GB 11.33%   100%    60.43GB 11.33%  mobiledatabooks.com/biggrep.Benchmark_100MB_End__20charsBigGrepScan__


----

On Sunday, May 8, 2022 at 9:36:34 PM UTC-7 Const V wrote:

> The program works fine with STDOUT, so there is no need to run
> program > /dev/null.
>
> I was using the Pipe in the unit tests (_test.go) to catch the STDOUT.
> How would you write the function you mention?
>
> On Sunday, May 8, 2022 at 9:05:51 PM UTC-7 Amnon wrote:
>
>> So what happens when you run your program > /dev/null?
>>
>> For testing I would write a function that reads from an io.Reader and 
>> writes to an io.Writer.
>> Write a unit test which uses a bytes.Buffer to catch the output.
>>
>> On Monday, 9 May 2022 at 04:59:37 UTC+1 Const V wrote:
>>
>>> I'm using OSX. 
>>>
>>> The only reason I need to redirect is to catch the STDOUT output in a 
>>> string for testing.
>>> It seems Pipe has limited capacity. Maybe there is another way.
>>>
>>> On Sunday, May 8, 2022 at 8:33:09 PM UTC-7 Amnon wrote:
>>>
>>>>
>>>> Why don't you try redirecting stdout to /dev/null and see how your 
>>>> program behaves.
>>>>
>>>> Also, which OS are you using?
>>>>
>>>> On Sun, May 8, 2022 at 11:36 PM Const V <ths...@gmail.com> wrote:
>>>>
>>>>> reading 1 line '\n' delimited 100MB file 
>>>>> r1 := bufio.NewReader(file)
>>>>> s := ReadWithReadLine(r1)
>>>>> InputProcessing(strings.NewReader(s), os.Stdout)
>>>>>
>>>>> in InputProcessing:
>>>>> w.Write([]byte(s)) -> waiting forever
>>>>> w.Write([]byte(s)[:100]) -> working
>>>>> ---
>>>>> func InputProcessing(r io.Reader, w io.Writer) {
>>>>>     find := "error"
>>>>>     /////////////////////////////////
>>>>>     s := ReadWithReadLine(r)
>>>>>     if strings.Contains(s, find) {
>>>>>         w.Write([]byte(s)[:100])
>>>>>         w.Write([]byte("\n"))
>>>>>     } else {
>>>>>         w.Write([]byte(" \n"))
>>>>>     }
>>>>> }
>>>>> ---
>>>>> On Sunday, May 8, 2022 at 3:21:52 PM UTC-7 Amnon wrote:
>>>>>
>>>>>> On Sun, May 8, 2022 at 10:41 PM Const V <ths...@gmail.com> wrote:
>>>>>>
>>>>>>> write to stdout is not working for MB long strings
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>> That is very surprising indeed.
>>>>>> How do you reach the conclusion?
>>>>>> How can we replicate that failure?
>>>>>>
>>>>> -- 
>>>>> You received this message because you are subscribed to a topic in the 
>>>>> Google Groups "golang-nuts" group.
>>>>> To unsubscribe from this topic, visit 
>>>>> https://groups.google.com/d/topic/golang-nuts/IUMIxWx9aLk/unsubscribe.
>>>>> To unsubscribe from this group and all its topics, send an email to 
>>>>> golang-nuts...@googlegroups.com.
>>>>> To view this discussion on the web visit 
>>>>> https://groups.google.com/d/msgid/golang-nuts/b1259910-58ca-4a57-bb05-1dde2fc73443n%40googlegroups.com
>>>>>
>>>>

-- 
You received this message because you are subscribed to the Google Groups 
"golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to golang-nuts+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/golang-nuts/33b29512-da79-4dab-9e37-99270c585175n%40googlegroups.com.
