details...

celeste:gg mtj$ ls -l ~/corpora/go/corpusZ.tar
-rw-r--r--  1 mtj  staff  92386304 Jul  4 02:27
/Users/mtj/corpora/go/corpusZ.tar

This is the Go corpus, compressed in chunks and then gathered into a tar
file: 92 MB at zstd level 19, from an original size of 752 311 514 bytes.
(Zstd compresses some large Go files 50:1.)

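As a quick cross-check of those figures, here is a small sketch that
computes the overall ratio from the two sizes given above, and the ratio
for one highly compressible chunk (blob_000003, whose sizes come from the
per-file log later in this message). The helper name ratio is made up for
illustration; the tar size includes tar headers, so the corpus figure is
only approximate.

```go
package main

import "fmt"

// ratio reports how many original bytes each compressed byte stands for.
func ratio(original, compressed int64) float64 {
	return float64(original) / float64(compressed)
}

func main() {
	// Whole archive: sizes from the ls listing and the paragraph above.
	fmt.Printf("corpus      %.2f:1\n", ratio(752311514, 92386304))
	// One highly compressible chunk, taken from the per-file log (blob_000003).
	fmt.Printf("blob_000003 %.2f:1\n", ratio(4889323, 97312))
}
```

So the corpus as a whole compresses roughly 8:1, while the best chunks
reach about 50:1.
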
Here is a search for strings in that tar's unarchived, decompressed Go
code using gg (https://github.com/MichaelTJones/gg):

celeste:gg mtj$ gg -summary -log log -digits s pizza
~/corpora/go/corpusZ.tar
/Users/mtj/corpora/go/corpusZ.tar::blob_000022.go: add(pair("blake", "eats
pizza"))
/Users/mtj/corpora/go/corpusZ.tar::blob_000022.go: t.Fatalf("after pizza,
size = %d; want %d", d.dynTab.size, want)
/Users/mtj/corpora/go/corpusZ.tar::blob_000022.go: "pizza",
/Users/mtj/corpora/go/corpusZ.tar::blob_000022.go: "pizza",
/Users/mtj/corpora/go/corpusZ.tar::blob_000022.go:
"piwatepizzapkonskowolayangroupharmacyshirahamatonbetsurgutsiracu" +
/Users/mtj/corpora/go/corpusZ.tar::blob_000031.go: data:   `{"tags":
[{"list": [{"one":"pizza"}]}]}`,
/Users/mtj/corpora/go/corpusZ.tar::blob_000031.go: output: "pizza",
/Users/mtj/corpora/go/corpusZ.tar::blob_000065.go:
"raskaunbieidsvollpiwatepizzapkosaigawaplanetariuminanoplantation" +
/Users/mtj/corpora/go/corpusZ.tar::blob_000099.go: add(pair("blake", "eats
pizza"))
/Users/mtj/corpora/go/corpusZ.tar::blob_000099.go: t.Fatalf("after pizza,
size = %d; want %d", d.dynTab.size, want)
/Users/mtj/corpora/go/corpusZ.tar::blob_000099.go: "pizza",
/Users/mtj/corpora/go/corpusZ.tar::blob_000099.go: "pizza",
/Users/mtj/corpora/go/corpusZ.tar::blob_000099.go:
"piwatepizzapkoninjamisonplanetariuminnesotaketakayamatsumaebashi" +
/Users/mtj/corpora/go/corpusZ.tar::blob_000137.go: {"test :\n```bash\nthis
is a test\n```\n\ntest\n\n:cool::blush:::pizza:\\:blush : : blush:
:pizza:", []byte("test :\n```bash\nthis is a
test\n```\n\ntest\n\n🆒😊:🍕\\:blush : : blush: 🍕")},
/Users/mtj/corpora/go/corpusZ.tar::blob_000137.go: ":pizza:":
                 "\U0001f355",
/Users/mtj/corpora/go/corpusZ.tar::blob_000137.go: pizzaMessage :=
emoji.Sprint("I like a :pizza: and :sushi:!!")
performance
  grep  16 matches
  work  752 311 514 bytes, 170 302 873 tokens, 22 927 078 lines, 176 files
  time  3.975375 sec elapsed, 25.803189 sec user + 0.985283 system
  rate  189 242 887 bytes/sec, 42 839 444 tokens/sec, 5 767 273 lines/sec,
44 files/sec
  cpus  7 workers (parallel speedup = 6.74x)
celeste:gg mtj$

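The summary figures in that performance block can be reproduced from the
raw numbers. The sketch below assumes that the rate is bytes divided by
elapsed time and that the parallel speedup is CPU time (user + system)
divided by elapsed time; gg's own code is not shown here, so treat the
formulas as an assumption that merely happens to agree with the printed
values. The type and method names are made up for illustration.

```go
package main

import "fmt"

// summary holds the raw figures from the "performance" block.
type summary struct {
	bytes                 float64
	elapsed, user, system float64 // seconds
}

// bytesPerSec is the average throughput over the whole run.
func (s summary) bytesPerSec() float64 { return s.bytes / s.elapsed }

// speedup is CPU time over wall-clock time (assumed definition).
func (s summary) speedup() float64 { return (s.user + s.system) / s.elapsed }

func main() {
	s := summary{bytes: 752311514, elapsed: 3.975375, user: 25.803189, system: 0.985283}
	fmt.Printf("rate    %.1f MB/sec\n", s.bytesPerSec()/1e6)
	fmt.Printf("speedup %.2fx\n", s.speedup())
}
```

This gives about 189 MB/sec and 6.74x. The elapsed time is printed to the
microsecond, so the recomputed rate can differ from the printed one in the
last few digits.
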
This rate, 189 MB/sec, was measured on a 4-core, 8-thread MacBook Pro and
is the average over the whole corpus. It therefore reflects not just
lexical tokenization of 752 MB of Go code and grep-like regular expression
matching for "pizza", but also Klaus' Go-only, native Zstandard
decompression, which expands the 92 MB tar file into the 752 MB of code.
Here are the timing numbers, per file, for Zstd expansion (gg -log log):

celeste:gg mtj$ cat log
2019/07/11 10:05:31.433048 scan begins
2019/07/11 10:05:31.433240 processing files listed on command line
2019/07/11 10:05:31.433324 processing tar archive
/Users/mtj/corpora/go/corpusZ.tar
2019/07/11 10:05:31.471641     206993 →  4023123 bytes (19.436×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000005.go.zst
2019/07/11 10:05:31.475658      97312 →  4889323 bytes (50.244×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000003.go.zst
2019/07/11 10:05:31.478435     588144 →  4058916 bytes ( 6.901×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000000.go.zst
2019/07/11 10:05:31.478751     252664 →  4343752 bytes (17.192×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000001.go.zst
2019/07/11 10:05:31.483804     254045 →  4245209 bytes (16.710×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000006.go.zst
2019/07/11 10:05:31.483819      93940 →  4666515 bytes (49.675×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000004.go.zst
2019/07/11 10:05:31.484640     212543 →  5266751 bytes (24.780×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000002.go.zst
2019/07/11 10:05:31.596666     796173 →  5097981 bytes ( 6.403×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000012.go.zst
2019/07/11 10:05:31.605878     222357 →  4149988 bytes (18.664×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000013.go.zst
2019/07/11 10:05:31.606986     115654 →  5244204 bytes (45.344×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000008.go.zst
2019/07/11 10:05:31.621446     202508 →  4353542 bytes (21.498×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000010.go.zst
2019/07/11 10:05:31.624954     326016 →  4010657 bytes (12.302×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000011.go.zst
2019/07/11 10:05:31.632554     207580 →  4065186 bytes (19.584×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000009.go.zst
2019/07/11 10:05:31.645472     174997 →  4008262 bytes (22.905×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000007.go.zst
2019/07/11 10:05:31.740442     426643 →  4001493 bytes ( 9.379×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000019.go.zst
2019/07/11 10:05:31.744056     264530 →  4046021 bytes (15.295×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000016.go.zst
2019/07/11 10:05:31.745338     627327 →  4004365 bytes ( 6.383×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000017.go.zst
2019/07/11 10:05:31.751263     258532 →  4042393 bytes (15.636×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000018.go.zst
2019/07/11 10:05:31.751756     471019 →  4008325 bytes ( 8.510×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000015.go.zst
2019/07/11 10:05:31.760835     379616 →  4320295 bytes (11.381×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000014.go.zst
2019/07/11 10:05:31.802144     556027 →  4003356 bytes ( 7.200×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000020.go.zst
2019/07/11 10:05:31.901767     372064 →  4193635 bytes (11.271×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000024.go.zst
2019/07/11 10:05:31.909306     397412 →  4554722 bytes (11.461×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000023.go.zst
2019/07/11 10:05:31.916362    1248756 →  8379069 bytes ( 6.710×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000026.go.zst
2019/07/11 10:05:31.932392     770723 →  4808476 bytes ( 6.239×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000022.go.zst
2019/07/11 10:05:31.970928     224452 →  4027062 bytes (17.942×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000025.go.zst
2019/07/11 10:05:32.002406     651210 →  4400457 bytes ( 6.757×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000027.go.zst
2019/07/11 10:05:32.004718    1231592 →  4316079 bytes ( 3.504×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000021.go.zst
2019/07/11 10:05:32.087856     353702 →  4029909 bytes (11.394×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000031.go.zst
2019/07/11 10:05:32.105936     848426 →  4039351 bytes ( 4.761×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000030.go.zst
2019/07/11 10:05:32.125037     650697 →  4003712 bytes ( 6.153×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000029.go.zst
2019/07/11 10:05:32.149261     518660 →  4008953 bytes ( 7.729×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000032.go.zst
2019/07/11 10:05:32.157127     654288 →  4007926 bytes ( 6.126×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000033.go.zst
2019/07/11 10:05:32.183090     896938 →  4703274 bytes ( 5.244×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000028.go.zst
2019/07/11 10:05:32.213361     744952 →  4053128 bytes ( 5.441×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000034.go.zst
2019/07/11 10:05:32.248550     491073 →  4008267 bytes ( 8.162×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000038.go.zst
2019/07/11 10:05:32.265478     680203 →  4001134 bytes ( 5.882×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000037.go.zst
2019/07/11 10:05:32.271071     734833 →  4000138 bytes ( 5.444×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000036.go.zst
2019/07/11 10:05:32.284729     360922 →  4005048 bytes (11.097×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000039.go.zst
2019/07/11 10:05:32.303961     308798 →  4018343 bytes (13.013×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000035.go.zst
2019/07/11 10:05:32.306066     391388 →  4010727 bytes (10.247×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000040.go.zst
2019/07/11 10:05:32.362341     484355 →  4001344 bytes ( 8.261×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000045.go.zst
2019/07/11 10:05:32.380463     375405 →  4000209 bytes (10.656×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000041.go.zst
2019/07/11 10:05:32.417273     358429 →  4046348 bytes (11.289×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000044.go.zst
2019/07/11 10:05:32.420960     375823 →  4378995 bytes (11.652×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000043.go.zst
2019/07/11 10:05:32.444477     730772 →  4246251 bytes ( 5.811×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000046.go.zst
2019/07/11 10:05:32.459183     604840 →  4027513 bytes ( 6.659×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000042.go.zst
2019/07/11 10:05:32.480410     518876 →  4008279 bytes ( 7.725×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000047.go.zst
2019/07/11 10:05:32.511231     330245 →  4667908 bytes (14.135×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000048.go.zst
2019/07/11 10:05:32.538113     641796 →  4017023 bytes ( 6.259×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000052.go.zst
2019/07/11 10:05:32.587558     686143 →  4358371 bytes ( 6.352×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000053.go.zst
2019/07/11 10:05:32.592458    1011230 →  4002547 bytes ( 3.958×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000051.go.zst
2019/07/11 10:05:32.597859     627406 →  4003373 bytes ( 6.381×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000050.go.zst
2019/07/11 10:05:32.605734     502047 →  4009282 bytes ( 7.986×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000049.go.zst
2019/07/11 10:05:32.655583     586706 →  4066973 bytes ( 6.932×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000054.go.zst
2019/07/11 10:05:32.679516     620595 →  4002107 bytes ( 6.449×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000059.go.zst
2019/07/11 10:05:32.687838     369297 →  4001749 bytes (10.836×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000058.go.zst
2019/07/11 10:05:32.691284     408358 →  4008987 bytes ( 9.817×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000055.go.zst
2019/07/11 10:05:32.744006     636283 →  4028244 bytes ( 6.331×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000060.go.zst
2019/07/11 10:05:32.768291     277219 →  4002291 bytes (14.437×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000057.go.zst
2019/07/11 10:05:32.781058     557989 →  4005679 bytes ( 7.179×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000056.go.zst
2019/07/11 10:05:32.822744     414283 →  4269034 bytes (10.305×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000061.go.zst
2019/07/11 10:05:32.835958     468418 →  4059427 bytes ( 8.666×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000066.go.zst
2019/07/11 10:05:32.847792     608908 →  4014328 bytes ( 6.593×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000062.go.zst
2019/07/11 10:05:32.888771     690474 →  4000820 bytes ( 5.794×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000065.go.zst
2019/07/11 10:05:32.897517     521619 →  4150632 bytes ( 7.957×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000067.go.zst
2019/07/11 10:05:32.921717     680148 →  4000246 bytes ( 5.881×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000064.go.zst
2019/07/11 10:05:32.956136     639151 →  4000706 bytes ( 6.259×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000063.go.zst
2019/07/11 10:05:32.962578     471848 →  4040865 bytes ( 8.564×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000073.go.zst
2019/07/11 10:05:32.991789     489327 →  4005132 bytes ( 8.185×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000068.go.zst
2019/07/11 10:05:33.020254     523091 →  4004755 bytes ( 7.656×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000069.go.zst
2019/07/11 10:05:33.025607     450401 →  4083159 bytes ( 9.066×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000072.go.zst
2019/07/11 10:05:33.029834     520708 →  4007521 bytes ( 7.696×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000074.go.zst
2019/07/11 10:05:33.050782     378552 →  4001925 bytes (10.572×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000071.go.zst
2019/07/11 10:05:33.118232     269084 →  4009476 bytes (14.900×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000080.go.zst
2019/07/11 10:05:33.138117     592627 →  4526083 bytes ( 7.637×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000075.go.zst
2019/07/11 10:05:33.140114     738182 →  4008928 bytes ( 5.431×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000070.go.zst
2019/07/11 10:05:33.167534     663139 →  4001954 bytes ( 6.035×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000079.go.zst
2019/07/11 10:05:33.177132     516721 →  4001675 bytes ( 7.744×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000081.go.zst
2019/07/11 10:05:33.189430     499878 →  4000568 bytes ( 8.003×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000078.go.zst
2019/07/11 10:05:33.216166     308152 →  4067621 bytes (13.200×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000076.go.zst
2019/07/11 10:05:33.249731     582032 →  4394445 bytes ( 7.550×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000082.go.zst
2019/07/11 10:05:33.269010     364537 →  4396798 bytes (12.061×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000087.go.zst
2019/07/11 10:05:33.283364     184365 →  4373843 bytes (23.724×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000088.go.zst
2019/07/11 10:05:33.287415     712220 →  4055369 bytes ( 5.694×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000077.go.zst
2019/07/11 10:05:33.321402     751294 →  4003781 bytes ( 5.329×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000086.go.zst
2019/07/11 10:05:33.324074     354366 →  4002277 bytes (11.294×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000085.go.zst
2019/07/11 10:05:33.334724     716464 →  4006384 bytes ( 5.592×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000094.go.zst
2019/07/11 10:05:33.355151     363508 →  4001510 bytes (11.008×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000083.go.zst
2019/07/11 10:05:33.400834     598521 →  4620592 bytes ( 7.720×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000095.go.zst
2019/07/11 10:05:33.409613     198910 →  4094836 bytes (20.586×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000089.go.zst
2019/07/11 10:05:33.435069     386309 →  4002560 bytes (10.361×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000084.go.zst
2019/07/11 10:05:33.445655     184047 →  4364073 bytes (23.712×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000090.go.zst
2019/07/11 10:05:33.457476     541657 →  4290849 bytes ( 7.922×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000092.go.zst
2019/07/11 10:05:33.470683     868726 →  4003490 bytes ( 4.608×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000093.go.zst
2019/07/11 10:05:33.495689    1260317 →  5556243 bytes ( 4.409×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000101.go.zst
2019/07/11 10:05:33.519851     579813 →  4005688 bytes ( 6.909×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000096.go.zst
2019/07/11 10:05:33.543813     672782 →  4002577 bytes ( 5.949×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000102.go.zst
2019/07/11 10:05:33.551926     566709 →  4006434 bytes ( 7.070×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000097.go.zst
2019/07/11 10:05:33.571590     569324 →  4041363 bytes ( 7.099×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000100.go.zst
2019/07/11 10:05:33.583279     200062 →  5051258 bytes (25.248×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000091.go.zst
2019/07/11 10:05:33.606004     472200 →  4024953 bytes ( 8.524×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000099.go.zst
2019/07/11 10:05:33.635531     575127 →  4001741 bytes ( 6.958×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000108.go.zst
2019/07/11 10:05:33.686650     534639 →  4012273 bytes ( 7.505×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000109.go.zst
2019/07/11 10:05:33.695229    1285860 →  5272770 bytes ( 4.101×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000103.go.zst
2019/07/11 10:05:33.720187     412864 →  4049274 bytes ( 9.808×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000107.go.zst
2019/07/11 10:05:33.720982     605166 →  4005251 bytes ( 6.618×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000098.go.zst
2019/07/11 10:05:33.737777     726153 →  4001435 bytes ( 5.510×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000104.go.zst
2019/07/11 10:05:33.755917     799097 →  4000015 bytes ( 5.006×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000106.go.zst
2019/07/11 10:05:33.773795     422867 →  4000786 bytes ( 9.461×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000115.go.zst
2019/07/11 10:05:33.810990     393810 →  4001894 bytes (10.162×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000110.go.zst
2019/07/11 10:05:33.847795     549097 →  4457876 bytes ( 8.119×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000114.go.zst
2019/07/11 10:05:33.863927     271390 →  4008218 bytes (14.769×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000105.go.zst
2019/07/11 10:05:33.867743     233106 →  4000444 bytes (17.161×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000111.go.zst
2019/07/11 10:05:33.867836     562574 →  4003299 bytes ( 7.116×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000116.go.zst
2019/07/11 10:05:33.911283     311625 →  4018660 bytes (12.896×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000112.go.zst
2019/07/11 10:05:33.925529     694084 →  4509171 bytes ( 6.497×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000113.go.zst
2019/07/11 10:05:33.929689     455910 →  4000240 bytes ( 8.774×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000122.go.zst
2019/07/11 10:05:33.943061     568708 →  4969481 bytes ( 8.738×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000117.go.zst
2019/07/11 10:05:33.999025     701356 →  4041892 bytes ( 5.763×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000121.go.zst
2019/07/11 10:05:34.019081     947328 →  5271725 bytes ( 5.565×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000123.go.zst
2019/07/11 10:05:34.046375     503850 →  4000020 bytes ( 7.939×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000118.go.zst
2019/07/11 10:05:34.067051     264587 →  4053813 bytes (15.321×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000120.go.zst
2019/07/11 10:05:34.088091     531199 →  4705989 bytes ( 8.859×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000129.go.zst
2019/07/11 10:05:34.091944     502142 →  4010187 bytes ( 7.986×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000119.go.zst
2019/07/11 10:05:34.121482     416870 →  4002183 bytes ( 9.601×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000124.go.zst
2019/07/11 10:05:34.126305     503153 →  4010237 bytes ( 7.970×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000130.go.zst
2019/07/11 10:05:34.137981     573336 →  4029106 bytes ( 7.027×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000128.go.zst
2019/07/11 10:05:34.203384     464844 →  4022011 bytes ( 8.652×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000127.go.zst
2019/07/11 10:05:34.209802     434146 →  4005918 bytes ( 9.227×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000125.go.zst
2019/07/11 10:05:34.226746     562023 →  4036194 bytes ( 7.182×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000126.go.zst
2019/07/11 10:05:34.253277     534846 →  4298152 bytes ( 8.036×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000136.go.zst
2019/07/11 10:05:34.283738     474416 →  4015827 bytes ( 8.465×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000135.go.zst
2019/07/11 10:05:34.297443     676273 →  4017143 bytes ( 5.940×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000137.go.zst
2019/07/11 10:05:34.350197     485533 →  4003784 bytes ( 8.246×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000134.go.zst
2019/07/11 10:05:34.367692     558522 →  4012951 bytes ( 7.185×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000133.go.zst
2019/07/11 10:05:34.379223    2771127 → 13048604 bytes ( 4.709×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000131.go.zst
2019/07/11 10:05:34.385873    2528170 → 10905639 bytes ( 4.314×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000132.go.zst
2019/07/11 10:05:34.420100     309923 →  4001685 bytes (12.912×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000143.go.zst
2019/07/11 10:05:34.440169     376289 →  4028740 bytes (10.707×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000142.go.zst
2019/07/11 10:05:34.447682     316988 →  4001667 bytes (12.624×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000144.go.zst
2019/07/11 10:05:34.496823     290607 →  4002723 bytes (13.774×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000141.go.zst
2019/07/11 10:05:34.522682     434315 →  4018276 bytes ( 9.252×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000150.go.zst
2019/07/11 10:05:34.529963     658292 →  4000448 bytes ( 6.077×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000140.go.zst
2019/07/11 10:05:34.543966     279896 →  4110833 bytes (14.687×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000151.go.zst
2019/07/11 10:05:34.572687    1086021 →  4494390 bytes ( 4.138×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000149.go.zst
2019/07/11 10:05:34.576312     611655 →  4095210 bytes ( 6.695×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000138.go.zst
2019/07/11 10:05:34.626874     317859 →  4172019 bytes (13.125×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000148.go.zst
2019/07/11 10:05:34.638622     376698 →  4003119 bytes (10.627×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000157.go.zst
2019/07/11 10:05:34.639242     533870 →  4003901 bytes ( 7.500×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000139.go.zst
2019/07/11 10:05:34.690782     274293 →  4083501 bytes (14.887×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000147.go.zst
2019/07/11 10:05:34.702534     744282 →  4006230 bytes ( 5.383×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000156.go.zst
2019/07/11 10:05:34.723371     291584 →  4168844 bytes (14.297×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000145.go.zst
2019/07/11 10:05:34.729076     422603 →  4006953 bytes ( 9.482×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000155.go.zst
2019/07/11 10:05:34.731728     711141 →  4008574 bytes ( 5.637×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000158.go.zst
2019/07/11 10:05:34.775064     603118 →  4136079 bytes ( 6.858×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000163.go.zst
2019/07/11 10:05:34.792702     340353 →  4424067 bytes (12.998×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000164.go.zst
2019/07/11 10:05:34.796162     618352 →  4002249 bytes ( 6.472×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000154.go.zst
2019/07/11 10:05:34.820524     289494 →  4036048 bytes (13.942×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000146.go.zst
2019/07/11 10:05:34.820605     309016 →  5260095 bytes (17.022×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000152.go.zst
2019/07/11 10:05:34.865440     131045 →  4280066 bytes (32.661×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000165.go.zst
2019/07/11 10:05:34.870480     406615 →  4000797 bytes ( 9.839×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000162.go.zst
2019/07/11 10:05:34.918977     252124 →  4012267 bytes (15.914×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000153.go.zst
2019/07/11 10:05:34.927295     456129 →  4042538 bytes ( 8.863×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000161.go.zst
2019/07/11 10:05:34.942748     717740 →  4319130 bytes ( 6.018×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000170.go.zst
2019/07/11 10:05:35.011325     731744 →  4013175 bytes ( 5.484×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000171.go.zst
2019/07/11 10:05:35.040012     565284 →  4163041 bytes ( 7.365×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000169.go.zst
2019/07/11 10:05:35.062413     289918 →  4000001 bytes (13.797×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000159.go.zst
2019/07/11 10:05:35.066697     703842 →  4000001 bytes ( 5.683×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000160.go.zst
2019/07/11 10:05:35.075955     190967 →  4197565 bytes (21.981×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000168.go.zst
2019/07/11 10:05:35.085458     290878 →  4010780 bytes (13.789×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000172.go.zst
2019/07/11 10:05:35.170622      97265 →  4243884 bytes (43.632×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000166.go.zst
2019/07/11 10:05:35.184665     171285 →  4069187 bytes (23.757×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000167.go.zst
2019/07/11 10:05:35.224262     534458 →  2880005 bytes ( 5.389×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000175.go.zst
2019/07/11 10:05:35.312594     619070 →  4002126 bytes ( 6.465×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000174.go.zst
2019/07/11 10:05:35.313823     518866 →  4350379 bytes ( 8.384×)
 decompress and scan /Users/mtj/corpora/go/corpusZ.tar::blob_000173.go.zst
2019/07/11 10:05:35.408361 scan ends
2019/07/11 10:05:35.408391 performance
2019/07/11 10:05:35.408400   grep  16 matches
2019/07/11 10:05:35.408408   work  752 311 514 bytes, 170 302 873 tokens,
22 927 078 lines, 176 files
2019/07/11 10:05:35.408417   time  3.975375 sec elapsed, 25.803189 sec user
+ 0.985283 system
2019/07/11 10:05:35.408424   rate  189 242 887 bytes/sec, 42 839 444
tokens/sec, 5 767 273 lines/sec, 44 files/sec
2019/07/11 10:05:35.408431   cpus  7 workers (parallel speedup = 6.74x)
celeste:gg mtj$

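For readers who want to try the same shape of pipeline, here is a rough,
self-contained sketch: a tar archive whose members are independently
compressed chunks, read in order and handed to a pool of workers that each
decompress and scan one member. It is not gg's code. Because the Go
standard library has no Zstandard package, compress/gzip stands in for
zstd here (an assumption made only to keep the example runnable), and the
function names pack, scan and grepArchive are hypothetical.

```go
package main

import (
	"archive/tar"
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
	"sync"
)

// pack gzips each chunk and stores it as one member of an in-memory tar.
// gzip is a stand-in: the archive in the message holds .zst members.
func pack(chunks []string) ([]byte, error) {
	var out bytes.Buffer
	tw := tar.NewWriter(&out)
	for i, text := range chunks {
		var z bytes.Buffer
		zw := gzip.NewWriter(&z)
		if _, err := zw.Write([]byte(text)); err != nil {
			return nil, err
		}
		if err := zw.Close(); err != nil {
			return nil, err
		}
		hdr := &tar.Header{Name: fmt.Sprintf("blob_%06d.go.gz", i), Mode: 0o644, Size: int64(z.Len())}
		if err := tw.WriteHeader(hdr); err != nil {
			return nil, err
		}
		if _, err := tw.Write(z.Bytes()); err != nil {
			return nil, err
		}
	}
	err := tw.Close()
	return out.Bytes(), err
}

// scan decompresses one member and counts occurrences of needle in it.
func scan(member []byte, needle string) (int, error) {
	zr, err := gzip.NewReader(bytes.NewReader(member))
	if err != nil {
		return 0, err
	}
	text, err := io.ReadAll(zr)
	return bytes.Count(text, []byte(needle)), err
}

// grepArchive reads tar members in order and fans them out to workers,
// each of which decompresses and scans whole members independently.
func grepArchive(archive []byte, needle string, workers int) (int, error) {
	var (
		wg      sync.WaitGroup
		mu      sync.Mutex
		total   int
		scanErr error
	)
	jobs := make(chan []byte)
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for member := range jobs {
				n, err := scan(member, needle)
				mu.Lock()
				total += n
				if err != nil && scanErr == nil {
					scanErr = err
				}
				mu.Unlock()
			}
		}()
	}
	tr := tar.NewReader(bytes.NewReader(archive))
	var readErr error
	for readErr == nil {
		if _, readErr = tr.Next(); readErr == nil {
			var member []byte
			if member, readErr = io.ReadAll(tr); readErr == nil {
				jobs <- member
			}
		}
	}
	close(jobs)
	wg.Wait()
	if readErr != io.EOF {
		return total, readErr
	}
	return total, scanErr
}

func main() {
	archive, err := pack([]string{"var s = \"pizza\"\n", "// no match\n", "// pizza\n// more pizza\n"})
	if err != nil {
		panic(err)
	}
	n, err := grepArchive(archive, "pizza", 4)
	fmt.Println("matches:", n, "error:", err)
}
```

Compressing the corpus in separate chunks is what makes this work: each
member can be expanded without reference to any other, so decompression
and scanning parallelize across workers. This sketch counts occurrences
rather than matching lines, and it does no tokenization.
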
There is no doubt that Zstd is brilliant, nor that Klaus' library is
excellent.

On Thu, Jul 11, 2019 at 9:55 AM Aliaksandr Valialkin <valy...@gmail.com>
wrote:

>
>
> On Thu, Jul 11, 2019 at 7:29 PM Michael Jones <michael.jo...@gmail.com>
> wrote:
>
>> I use Klaus' library to decompress ~GiB files that have been compressed
>> by zstd command line (c/c++ code) at level 19. Works great.
>>
>
> Thanks for sharing this information!
>
>
>> On Thu, Jul 11, 2019 at 9:10 AM Klaus Post <klausp...@gmail.com> wrote:
>>
>>> On Thursday, 11 July 2019 17:37:09 UTC+2, Aliaksandr Valialkin wrote:
>>>>
>>>>
>>>>
>>>> This is really great idea! Will try implementing it.
>>>>
>>>> Does github.com/klauspost/compress support all the levels for data
>>>> decompression? VictoriaMetrics varies compression level depending on the
>>>> data type. It would be great if github.com/klauspost/compress could
>>>> decompress data compressed by the upstream zstd on higher compression
>>>> levels.
>>>>
>>>
>>> Decompression will work for all input. It is implementing the full spec.
>>>
>>
> Great! I filed feature request for implementing pure Go builds for
> VictoriaMetrics -
> https://github.com/VictoriaMetrics/VictoriaMetrics/issues/94 .
>
>
>>
>>> Compression has "Fastest" and "Default" implemented, roughly matching
>>> level 1 and 3 in zstd in speed and performance. I plan on adding something
>>> around level 7-9 (as Better) and level 19 (as Best). But for it to be
>>> useful I have mainly focused on the fastest modes. I also am planning more
>>> concurrency in compression and decompression for streams - blocks will
>>> probably remain as single goroutines. For now I am taking a small break and
>>> having a bit of fun revisiting deflate and experimenting with Snappy.
>>>
>>> If there is anything I can do to help feel free to ping me.
>>>
>>>
>>> /Klaus
>>>
>>>
>>>>
>>>> --
>>>> Best Regards,
>>>>
>>>> Aliaksandr
>>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "golang-nuts" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to golang-nuts+unsubscr...@googlegroups.com.
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/golang-nuts/b12c7562-b3a6-426b-bb1c-a62fcfc41714%40googlegroups.com
>>> <https://groups.google.com/d/msgid/golang-nuts/b12c7562-b3a6-426b-bb1c-a62fcfc41714%40googlegroups.com?utm_medium=email&utm_source=footer>
>>> .
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>
>>
>> --
>>
>> *Michael T. Jones michael.jo...@gmail.com <michael.jo...@gmail.com>*
>>
>>
>
>
> --
> Best Regards,
>
> Aliaksandr
>


-- 

*Michael T. Jones michael.jo...@gmail.com <michael.jo...@gmail.com>*

-- 
You received this message because you are subscribed to the Google Groups 
"golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to golang-nuts+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/golang-nuts/CALoEmQxLZiPEiiYkMyVeLXweVqHEzbFiZJSUpZE_fJXomgj%3DXQ%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.
