Riccardo wrote: > The BIP recall some go code for how the parameter has been selected which > I can hardly understand and run
The code you're linking to is for generating test vectors (to allow implementations to check the correctness of their gcs filters. The name of the file is 'gentestvectors.go'. It produces CSV files which contain test vectors of various testnet blocks and at various false positive rates. > it's totally my fault but if possible I would really like more details on > the process, like charts and explanations When we published the BIP draft last year (wow, time flies!), we put up code (as well as an interactive website) showing the process we used to arrive at the current false positive rate. The aim was to minimize the bandwidth required to download each filter plus the expected bandwidth from downloading "large-ish" full segwit blocks. The code simulated a few wallet types (in terms of number of addrs, etc) focusing on a "mid-sized" wallet. One could also model the selection as a Bernoulli process where we attempt to compute the probability that after k queries (let's say you have k addresses) we have k "successes". A success would mean the queries item wasn't found in the filter, while a failure is a filter match (false positive or not). A failure in the process requires fetching the entire block. -- Laolu On Fri, May 18, 2018 at 5:35 AM Riccardo Casatta via bitcoin-dev < bitcoin-dev@lists.linuxfoundation.org> wrote: > Another parameter which heavily affects filter size is the false positive > rate which is empirically set > <https://github.com/bitcoin/bips/blob/master/bip-0158.mediawiki#construction> > to 2^-20 > The BIP recall some go code > <https://github.com/Roasbeef/bips/blob/83b83c78e189be898573e0bfe936dd0c9b99ecb9/gcs_light_client/gentestvectors.go> > for how the parameter has been selected which I can hardly understand and > run, it's totally my fault but if possible I would really like more details > on the process, like charts and explanations (for example, which is the > number of elements to search for which the filter has been optimized for?) > > Instinctively I feel 2^-20 is super low and choosing a lot higher alpha > will shrink the total filter size by gigabytes at the cost of having to > wastefully download just some megabytes of blocks. > > > 2018-05-17 18:36 GMT+02:00 Gregory Maxwell via bitcoin-dev < > bitcoin-dev@lists.linuxfoundation.org>: > >> On Thu, May 17, 2018 at 3:25 PM, Matt Corallo via bitcoin-dev >> <bitcoin-dev@lists.linuxfoundation.org> wrote: >> > I believe (1) could be skipped entirely - there is almost no reason why >> > you'd not be able to filter for, eg, the set of output scripts in a >> > transaction you know about >> >> I think this is convincing for the txids themselves. >> >> What about also making input prevouts filter based on the scriptpubkey >> being _spent_? Layering wise in the processing it's a bit ugly, but >> if you validated the block you have the data needed. >> >> This would eliminate the multiple data type mixing entirely. >> _______________________________________________ >> bitcoin-dev mailing list >> bitcoin-dev@lists.linuxfoundation.org >> https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev >> > > > > -- > Riccardo Casatta - @RCasatta <https://twitter.com/RCasatta> > _______________________________________________ > bitcoin-dev mailing list > bitcoin-dev@lists.linuxfoundation.org > https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev >
_______________________________________________ bitcoin-dev mailing list bitcoin-dev@lists.linuxfoundation.org https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev