I cleaned up the code and added more commments. It should be much easier now to uderstand what/why the code is doing after reading my paper.
Here is the gist location of the source code for this version. [https://gist.github.com/jzakiya/94670e6144735eb0041919f633d6011c](https://gist.github.com/jzakiya/94670e6144735eb0041919f633d6011c) I've created a rearchitected version to run in parallel, and in fact I got it to compile, and run with correct results, but is 2x slower than the serial version. If anyone knows how to really create parallel running programs please let me know. I will create a github project for all the different versions I|others create, for different languages, prime generators, parallel versions, etc. So far I have various C++ versions, and now this Nim version, and I plan to do a Crystal version too. If people want to, you can contact me directly via email, which is included in my pape and the code intro. I would urge people who run the code to play with the B value for their system cache. I've only run this code on Intel I5|7 cpu laptops. I'm interested how the code runs on other systems (AMD, ARM, PowerPC, etc) and operarting systems (I've only used Linux). Here's the code in the gist file. #[ This Nim source file will compile to an executable program to perform the Segmented Sieve of Zakiya (SSoZ) to find primes <= N. It is based on the P5 Strictly Prime (SP) Prime Generator. Prime Genrators have the form: mod*k + ri; ri -> {1,r1..mod-1} The residues ri are integers coprime to mod, i.e. gcd(ri,mod) = 1 For P5, mod = 2*3*5 = 30 and the number of residues are rescnt = (2-1)(3-1)(5-1) = 8, which are {1,7,11,13,17,19,23,29}. Adjust segment byte length parameter B (usually L1|l2 cache size) for optimum operational performance for cpu|system being used. Verified on Nim 0.17, using clang (smaller exec) or gcc $ nim c --cc:[clang|gcc] --d:release ssozp5.nim Then run executable: $ ./ssozp5 <cr>, and enter value for N. As coded, input values cover the range: 7 -- 2^64-1 Related code, papers, and tutorials, are downloadable here: http://www.4shared.com/folder/TcMrUvTB/_online.html Use of this code is free subject to acknowledgment of copyright. Copyright (c) 2017 Jabari Zakiya -- jzakiya at gmail dot com Version Date: 2017/08/23 This code is provided under the terms of the GNU General Public License Version 3, GPLv3, or greater. License copy/terms are here: http://www.gnu.org/licenses/ ]# import math # for sqrt function import strutils, typetraits # for number input # Global var used to count number of primes in 'seg' variable # Each value is number of '0' bits (primes) for values 0..255 let pbits = [ 8,7,7,6,7,6,6,5,7,6,6,5,6,5,5,4,7,6,6,5,6,5,5,4,6,5,5,4,5,4,4,3 ,7,6,6,5,6,5,5,4,6,5,5,4,5,4,4,3,6,5,5,4,5,4,4,3,5,4,4,3,4,3,3,2 ,7,6,6,5,6,5,5,4,6,5,5,4,5,4,4,3,6,5,5,4,5,4,4,3,5,4,4,3,4,3,3,2 ,6,5,5,4,5,4,4,3,5,4,4,3,4,3,3,2,5,4,4,3,4,3,3,2,4,3,3,2,3,2,2,1 ,7,6,6,5,6,5,5,4,6,5,5,4,5,4,4,3,6,5,5,4,5,4,4,3,5,4,4,3,4,3,3,2 ,6,5,5,4,5,4,4,3,5,4,4,3,4,3,3,2,5,4,4,3,4,3,3,2,4,3,3,2,3,2,2,1 ,6,5,5,4,5,4,4,3,5,4,4,3,4,3,3,2,5,4,4,3,4,3,3,2,4,3,3,2,3,2,2,1 ,5,4,4,3,4,3,3,2,4,3,3,2,3,2,2,1,4,3,3,2,3,2,2,1,3,2,2,1,2,1,1,0 ] # The global residue values for the P5 prime generator. let residues = [1, 7, 11, 13, 17, 19, 23, 29, 31] # Global parameters const modp5 = 30 # mod value for P5 prime generator rescnt = 8 # number of residues for P5 prime generator var pcnt = 0 # number of primes from r1..sqrt(N) primecnt = 0'u64 # number of primes <= N next: seq[uint64] # table of regroups vals for primes multiples primes: seq[int] # list of primes r1..sqrt(N) seg: seq[uint8] # segment byte array to perform ssoz # This routine is used to compute the list of primes r1..sqrt(input_num), # stored in global var 'primes', and its count stored in global var 'pcnt'. # Any algorithm (fast|small) can be used. Here the SoZ using P7 is used. proc sozp7(val: int | int64) = # Use P7 prime gen to finds upto val let md = 210 # P7 mod value let rscnt = 48 # P7 residue count and residues list let res = [1, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67 , 71, 73, 79, 83, 89, 97,101,103,107,109,113,121,127,131,137,139 ,143,149,151,157,163,167,169,173,179,181,187,191,193,197,199,209,211] var posn = newSeq[int](210) # small array (hash) to convert for i in 0 .. <rscnt: posn[res[i]] = i-1 # (x mod md) to values -1,0,1,2..6 let num = (val-1) or 1 # if val even then subtract 1 var k = num div md # compute its residue group val var modk = md * k # compute its base num var r = 1 # starting with first residue while num >= modk+res[r]: r += 1 # find last pc position <= num let maxprms = k*rscnt + r - 1 # maximum number of prime candidates var prms = newSeq[bool](maxprms) # array of prime candidates set False let sqrtN = int(sqrt float64(num)) modk = 0; r = 0; k = 0 # sieve to eliminate prime multiples from list of pcs r1..sqrtN for prm in prms: # from list of pc positions in prms r += 1; if r > rscnt: (r = 1; modk += md; k += 1) if prm: continue # if pc not prime go to next location let res_r = res[r] # if prime save residue of prime value let prime = modk + res_r # numerate the prime value if prime > sqrtN: break # exit if > sqrtN let prmstep = prime * rscnt # prime's stepsize to eliminate its mults for ri in res[1..rscnt]: # eliminate prime's multiples from prms let prod = res_r * ri # residues cross-product for this prime # compute resgroup val of 1st prime multiple for each restrack # starting there, mark all prime multiples on restrack upto end of prms var prm_mult = (k*(prime + ri) + prod div md)*rscnt + posn[prod mod md] while prm_mult < maxprms: prms[prm_mult] = true; prm_mult += prmstep # prms now contains the nonprime positions for the prime candidates r1..N # extract primes into global var 'primes' and count into global var 'pcnt' primes = @[7] # r1 prime for P5 modk = 0; r=0 for prm in prms: # numerate primes from processed pcs list r += 1; if r > rscnt: (r = 1; modk += md) if not prm: primes.add(modk + res[r]) # put prime in global 'primes' list pcnt = len(primes) # set global count of primes # This routine initializes the [rescnt x pcnt] global var 'next' table with the # resgroup values of the 1st prime multiples for each prime r1..sqrtN along the # restracks for the 8 P5 residues (7, 11..29, 31). 'next' is used to eliminate # prime multiples in 'seg' for each SSoZ segment, and is updated with new # resgroup values for each prime for use with subsequent segment iterations. proc next_init() = var pos = newSeq[int](modp5) # create small array (hash) for i in 0 .. <rescnt: pos[residues[i]] = i-1 # to convert an (x mod modp5) pos[1] = rescnt-1 # val into restrack val 0..7 # load 'next' with 1st prime multiples regroups vals along each residue track for j, prime in primes: # for each prime r1..sqrt(N) let k = uint((prime-2)) div uint(modp5) # find the resgroup it's in let r = uint((prime-2)) mod uint(modp5) + 2 # and its residue value for ri in residues[1 .. rescnt]: # for each prime|residue pair let prod: int = int(r) * ri # compute res cross-product r*ri let row: int = pos[prod mod modp5] * pcnt # compute residue track address # compute|store resgroup val of 1st prime multiple for prime|ri residue pair next[row + j] = k*(uint(prime) + uint(ri)) + uint(prod-2) div uint(modp5) # This routine performs the segment sieve for a segment of Kn resgroups|bytes. # Each 'seg' bit represents a residue track of Kn resgroups, which are processed # sequentially. 'next' resgroup vals are used to mark prime multiples in 'seg' along # each restrack, and is udpated for each prime for the next segment. Upon completion # the number of primes ('0' bits) per 'seg' byte are added to global var 'primecnt'. proc segsieve(Kn: int) = # for Kn resgroups in segment for b in 0 .. <Kn: seg[b] = 0 # initialize seg bits to all prime '0' for r in 0 .. <rescnt: # for each ith (of 8) residues for P5 let biti = uint8(1 shl r) # set the ith residue track bit mask let row = r * pcnt # set address to ith row in next[] for j, prime in primes: # for each prime r1..sqrt(N) if next[row + j] < uint(Kn): # if 1st mult resgroup index <= seg size var k = int(next[row + j]) # get its resgroup value while k < Kn: # for each primenth byte < segment size seg[k] = seg[k] or biti # set resgroup's restrack bit to '1' (nonprime) k += int(prime) # compute next prime multiple resgroup next[row + j] = uint(k - Kn) # 1st resgroup in next eligible segment else: next[row + j] -= uint(Kn) # do if 1st mult resgroup index > seg size # count the primes in the segment for byt in seg[0..<Kn]: # for the Kn resgroup bytes primecnt += uint(pbits[byt]) # count the '0' bits as primes # Print segment primes. Can pipe output as: $ ./ssozp5 | less (or output file) # to make viewing primes easier proc printprms(Kn: int, Ki: uint64) = # print out prime values in segment for k in 0 .. <Kn: # for Kn residues groups|bytes for r in 0 .. <8: # for each restrack bit in byte if (int(seg[k]) and (1 shl r)) == 0: # if it's '0' it's a prime write(stdout, " ", uint64(modp5)*(Ki+uint64(k)) + uint64(residues[r+1])) #echo "\n" proc ssozp5() = # Perform SSoZ using P5 prime generator stdout.write "Enter integer number: " let val = stdin.readline.parseBiggestUInt # input integer val (7..2^64-1) const B = 256 * 1024 # adjust size to optimize for system's cache let KB = B # number of segment resgroups seg = newSeq[uint8](B) # create segment array of B bytes size echo("segment has ", B, " bytes and ", KB, " residues groups") let num = uint64((val-1) or 1) # if val even subtract 1 var k: uint64 = num div modp5 # compute its resgroup val var modk = uint64(modp5) * k # compute its base num var r = 1 # starting with first residue while num >= modk+uint64(residues[r]): r += 1 # find last pc position <= num let maxpcs = k * uint64(rescnt) + uint64(r-1) # maximum number of prime candidates let Kmax = uint64(num-2) div uint64(modp5) + 1 # maximum number of resgroups for num echo("prime candidates = ", maxpcs, "; resgroups = ", Kmax) let sqrtN = int(sqrt float64(num)) sozP7(sqrtN) # get pcnt and primes <= sqrt(nun) echo("create next[", rescnt, "x", pcnt, "] array") next = newSeq[uint64](rescnt*pcnt) # create size of the global next[] table next_init() # load with first prime multiples resgroups primecnt = 3 # 2,3,5 the P5 excluded primes count var Kn: int = KB # set sieve resgroups size to segment size var Ki = 0'u64 # starting resgroup index for first segment echo("perform segmented SoZ") while Ki < Kmax: # for KB size resgroup slices up to Kmax if Kmax-Ki < uint64(KB): Kn = int(Kmax-Ki) # to set last segment size segsieve(Kn) # sieve primes for current segment #printprms(Kn,Ki) # print primes for the segment (optional) Ki += uint64(KB) # first resgroup index for next seg slice var lprime = 0'u64 # get last prime and primecnt <= val modk = uint64(modp5) * (Kmax-1) # mod for last resgroup in last segment var b = Kn-1 # number of bytes|resgroup in last segment r = rescnt-1 # set to last restrack in resgroup while true: # repeat until last prime determined if (int(seg[b]) and (1 shl r)) == 0: lprime = modk + uint64(residues[r+1]) # numerate the prime value if lprime <= num: break # if <= num exit from loop with lprime primecnt -= 1 # else reduce primecnt if prime > num # reduce restrack, setup next resgroup r -= 1; if r < 0: (r = rescnt-1; modk -= modp5; b -= 1) # if necessary echo("last segment = ", Kn, " resgroups; segments iterations = ", Ki div uint64(KB)) echo("total primes = ", primecnt, "; last prime = ", lprime) ssozp5()