Re: [A51] A5_Ilx: hung without any result

2011-04-06 Thread Frank A. Stevenson
 Hi Frank,
 Great to hear that we can crack A5/1 in near real-time. But I had some
 problems

Does the test chain generator produce any output ? This should be the
first step to verify if the new kernel is working.

Such as:
./a5il_test
Initialized CAL
CAL Runtime version 1.4.736
Running on 2 GPUs
Num threads 1152
Num threads 1152
Switch to single.
Switch to single.
Switch to double.
Switch to double.
Completed 4608 chains
Completed 4608 chains
Completed 6244 chains
Completed 7580 chains
Completed 4608 chains
Completed 7730 chains
Completed 6094 chains
Completed 4608 chains
Completed 7730 chains
Completed 6094 chains
...



___
A51 mailing list
A51@lists.reflextor.com
http://lists.lists.reflextor.com/cgi-bin/mailman/listinfo/a51


Re: [A51] Improved Kraken - Any news/updates/beta?

2011-04-01 Thread Frank A. Stevenson
Over the last months I have been working on an improved low-latency
version of Kraken. This work has been carried out more or less in secret,
and has not been announced until now. The point of this version is to
determine how quickly a well funded attacker can break A5/1. The code is
reasonable well tested, but doesn't have any big performance gains except
if you are using SSD disks, or have a slower CPU.

What I have done is to write a new GPU A5/1 kernel that can both calculate
chains and relieve the CPU from the final search operation. This kernel is
slower than the bitsliced kernel when it comes to bulk processing A5/1
chains, but since each GPU thread only calculates 1 or 2 A5/1 instances,
the latency (time it takes to complete a single chain) is noticeably
lower. On my machine with 2 GPUs I can (re)crack a burst in under 5
seconds (with all look-ups cached in RAM). It is reasonable to assume that
with the same code given enough GPU / SSD storage, it should be possible
to perform the cracking operation in less than 2 seconds.

I have made the code available for evaluation now, but it isn't properly
integrated into the mainline (yet)

http://traxme.net/a5/a5_ilx/

Steps to build:
1) Untar and build
2) Place A5IlStubs.cpp  A5IlStubs.h in the Kraken build folder
3) Delete A5Cpu.so and A5Ati.so from the Kraken directory if present. Copy
A5Il.so this location instead.
4) Insert #include A5IlStubs.h prior to both includes of
../a5_cpu/A5CpuStubs.h
5) Add A5IlStubs.cpp to the build command (build.sh)
7) Rebuild Kraken  run

This procedure replaces the A5Cpu.so library with A5Il.so which contains
the new code. The new GPU code uses the bit population count
instruction, introduced with DX11 to quickly count the number of bits set
in the taps of the 3 LFSRs. This instruction is somtimes jokingly refered
to as the canonical NSA instruction, since said agency had a habit of
procuring computers that had such a CPU instruction. Also there is a neat
assembly level optimization using ulerp4 (The only instruction that has 3
distinct sources) to shave of some cycles. The A5/1 clocking routine is as
follows:

; Clock r100  - output to r110
func 11
ushr r101.xyz,r100.xyz,l12.xyz
and  r102.xyz,r101.xyz,l14.xyz
u4lerp r102.w,r102.x,r102.y,r102.z
ixor r102.w,r102.w,l1.w
ixor r100.w,r102.w,r102.z
and  r101.xyz,r101.xyz,l1.xyz
ixor r102.x,r101.x,r102.w
ixor r102.y,r101.y,r102.w
ixor r102.z,r101.z,r102.w
and  r104,r100,l13
icbits r106,r104
and  r106.xyz,r106.xyz,r102.xyz
ishl r100.xyz,r100.xyz,r102.xyz
ior  r100.xyz,r100.xyz,r106.xyz
and  r106.w,r106.w,l1.w
ishl r110.x,r110.x,l1.x
ior  r110.x,r110.x,r106.w
ret

The library automatically switches between 1 and 2 instances pr thread,
depending on load (A debug message is left in place). A further
improvement may be to have a version with 4 instances pr thread, but that
is partly outside the scope of just making a low-latency version.

Summary:
A well funded adversary can break A5/1 in near realtime (1-2 seconds)
using only publicly available tables and programs. This will be most
useful for intercepting frequency hopping voice calls on large and crowded
cells.

cheers,
Frank
___
A51 mailing list
A51@lists.reflextor.com
http://lists.lists.reflextor.com/cgi-bin/mailman/listinfo/a51


Re: [A51] Problem with table convertion tool

2010-10-26 Thread Frank A. Stevenson
 Hi list,

 I'm trying to write the tables (.dlt format) to a disk. The whole
 process goes fine with Behemoth.py, but when I call kraken afterwards, a
 bunch of them load fine and then for 116.idx I get this:

 kraken: DeltaLookup.cpp:70: DeltaLookup::DeltaLookup(NcqDevice*,
 std::string): Assertion `offset-0x7fff' failed.
 Aborted

 I re-added the tables with Behemoth.py twice, each time I got this error
 on different tables. I believe the only thing that changed is the order
 in which I added the tables with Behemoth.py. Any ideas on what I am
 doing wrong? For info the arch is i386. I'm positive that the computer
 hardware is not at fault (no bad ram or anything)...

 EiN'

This indicates that the data in the dlt file is corrupted. Check the file
size (should be similar to other files), and file checksum against the
torrent SHA-1 or MD5 posted to the list.

Frank

___
A51 mailing list
A51@lists.reflextor.com
http://lists.lists.reflextor.com/cgi-bin/mailman/listinfo/a51


[A51] Behemoth and the torrent files

2010-08-23 Thread Frank A. Stevenson
It was brought to my attention that Behemoth.py generates a ValueError
if you try to run it on the set of files that are distributed by
torrent.

This is caused by the torrent files having been renamed. The immediate
quick fix is to create a set of symbolic links with the just the table
id as names.:

tables/100.dlt - ../a51_table_100.dlt

Behemoth.py will be updated so it can handle this naming variation as
well. (soon) You have been warned :-)

Frank


___
A51 mailing list
A51@lists.reflextor.com
http://lists.lists.reflextor.com/cgi-bin/mailman/listinfo/a51


Re: [A51] A51 Digest, Vol 15, Issue 6

2010-08-11 Thread Frank A. Stevenson
On Wed, 2010-08-11 at 15:48 +0200, moongaboonga moongaboonga wrote:
  
 Where can I find hashes for tables?
 39GB per table is a lot and should be checked after download.
  
 Thank you!
  

MD5 sums have been posted here:

http://lists.lists.reflextor.com/pipermail/a51/2010-June/000675.html

284 is missing from this list, so it would be nice to have a checksum
for the sake of completeness. (I don't have this table, can someone who
does please post it.)

Frank


___
A51 mailing list
A51@lists.reflextor.com
http://lists.lists.reflextor.com/cgi-bin/mailman/listinfo/a51


[A51] Some small updates.

2010-08-11 Thread Frank A. Stevenson

For those who haven't noticed, Kraken code has found a new home in git,
and the latest version can be gotten by:

  git clone git://git.srlabs.de/kraken.git

The latest version can handle multiple requests simultaneously, and
these are identified by a unique job number.

I have written some sample glue code in python that shows how it is
possible to interface with Kraken over the server socket. The relevant
snippets look like this:

Simple function to read from socket:

def ReadLine(s):
r = 
c = s.recv(1)
while c!='\n':
r = r + c
c = s.recv(1)
return r


And the code for processing a batch of known plaintext looks something
like this

#Open server connection
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect((127.0.0.1, 9100))

found = False
for crk in cracks:
pos = crk.find(crack)
skip = False
if pos0:
cmd = crk[pos:]
s.send(cmd+\n)
while True:
r = ReadLine(s)
print r
if (len(r)  5) and r[:6]==crack :
break
if (len(r)  5) and r[:6]==Found :
find = r.split( )
key = find[1]
keypos = find[3]
#Another frame will have to be found from this group
framenum1 = ???
framenum2 = ???
burst2 = ???
pipe = os.popen(./find_kc %s %s %s %s %s%
(key,keypos,framenum1,framenum2,burst2))
result = pipe.read();
print result
rv = result.split(\n)
for r in rv:
if r.find(MATCH)0:
print r
key = r.split( )[1:9]
keystring = string.join(key, )
# This is your key
print keystring
found = True
break
if found:
break

#close connection
s.close()


I hope this can prove useful.

Frank

___
A51 mailing list
A51@lists.reflextor.com
http://lists.lists.reflextor.com/cgi-bin/mailman/listinfo/a51


Re: [A51] Tableformat and Kraken

2010-07-29 Thread Frank A. Stevenson
On Thu, 2010-07-29 at 15:10 +0200, Axel Walsleben wrote:
 Can you give me some IDs/advance parameter that have not yet computed ?
 Its better i make some, that are not already there. ;)
 
 thanks

Table IDs 600,610,620...,690 are available.

Frank


___
A51 mailing list
A51@lists.reflextor.com
http://lists.lists.reflextor.com/cgi-bin/mailman/listinfo/a51


Re: [A51] Tableformat and Kraken

2010-07-29 Thread Frank A. Stevenson
On Thu, 2010-07-29 at 17:19 +0200, Axel Walsleben wrote:
 my genTable selftest runs only on 1 GPU.
 
 This is the Output from Dmesg:
 
 [6.441438] fglrx: module license 'Proprietary. (C) 2002 - ATI 
 Technologies, Starnberg, GERMANY' taints kernel.
 [6.441443] Disabling lock debugging due to kernel taint
 [6.501099] [fglrx] Maximum main memory to use for locked dma 
 buffers: 3801 MBytes.
 [6.501153] [fglrx]   vendor: 1002 device: 6898 count: 1
 [6.501156] [fglrx]   vendor: 1002 device: 6898 count: 2
 [6.501159] [fglrx]   vendor: 1002 device: 6898 count: 3
 [6.501521] [fglrx] ioport: bar 4, base 0x9000, size: 0x100
 
 It Seems, that all 3 HD5870 were detected.
 The Sample Program FindNumDevices tolds me that i only have 1 Device.
 Does anybody have an Hint for me ?
 
 thxs

You need to edit GenTable.py - it is clearly marked where you should
edit the number of GPUs to use. (I know I could use all by default)

While you are at it, take some time to study the auto increment feature.
You will have to edit the increment (10) and the cutoff point (690).

When these settings are correctly adjusted, you can leave your machine
running for weeks, and it will produce 1 table after another without any
intervention from your side, just make sure you have enough free disk
space.

As the tables are completed, you run sort2 on the output data, and you
can delete the original files, freeing up space as you go along. (You
can also use TableConvert and compress the tables to free up even more
space.)

cheers,
Frank

___
A51 mailing list
A51@lists.reflextor.com
http://lists.lists.reflextor.com/cgi-bin/mailman/listinfo/a51


[A51] Kraken, now in server mode

2010-07-28 Thread Frank A. Stevenson
Rather than fighting with torrent creation tools, I spent my time
writing a TCP/IP server core and bolted it on to Kraken. This means that
you can specify a port when starting Kraken, and the program will the go
into server mode, and accept incoming connections on the port.

I have set an initial limit of 25 clients but more can be added if
needed. Currently crack is the only supported command, but I am
thinking to add some table selection logic.

The next step is to write a front end that takes bitvectors and
framecounts and feeds those to 1 or more servers. The front end should
also perform the find_kc check on the results. With this setup it should
be easy to set up a distributed Kraken network, reducing cracking times
dramatically, without resorting to expensive SSD media.

Enjoy!

Frank


___
A51 mailing list
A51@lists.reflextor.com
http://lists.lists.reflextor.com/cgi-bin/mailman/listinfo/a51


Re: [A51] Reporting in.. (what's there, what's missing and some ideas?)

2010-07-28 Thread Frank A. Stevenson
On Wed, 2010-07-28 at 19:20 +0200, Fabio Pietrosanti (naif) wrote:

 1) Airprobe dump the phone call traffic
 - We know that it require important improvement for demodulation of
 real signals
 - We have to see which is the best pratical approach to do it, to
 detect the call, to follow it and which procedure must be implemented
 
 2) Kraken crack the call a5/1 Kc key (that's the most important piece)
 
 3) Some piece of sw decrypt the a5/1 encrypted dump generated by
 Airprobe with the Kc cracked by Kraken.
 

There is a intermediate step here which one shouldn't forget. One needs
to find and identify known plaintext, which can be different from
network to network. So for initial decryption one will gave to find a
way to get Kc from ones SIM card, and use that to decrypt and analyze
call setup (on own conversations). This item is probably already made,
but should be on the list. An alternative may be to use a straight dump
from a Nokia phone.

Frank


___
A51 mailing list
A51@lists.reflextor.com
http://lists.lists.reflextor.com/cgi-bin/mailman/listinfo/a51


Re: [A51] NVIDIA Tesla C1060 GPGPU

2010-07-26 Thread Frank A. Stevenson
On Mon, 2010-07-26 at 18:00 +0100, Cal Leeming [Simplicity Media Ltd]
wrote:
 This might be a silly question but.. 
 
 
 If there was a miscalculation due to hardware, would you be able to
 detect such an occurrence, and re-process that calc again?
 
 

One of the big advantages with Tesla is the use of registered memory for
the GPU. This significantly reduces the risk of a bad memory bit
screwing up your calculations, and the hardware can let you know if such
an error exists. ATI doesn't offer this feature on any of their boards,
and that is pretty much the reason for them being absent from the HPC
space, despite having better performance in relation to both price and
power.

f


___
A51 mailing list
A51@lists.reflextor.com
http://lists.lists.reflextor.com/cgi-bin/mailman/listinfo/a51


[A51] Kraken now on GPU

2010-07-24 Thread Frank A. Stevenson
I just finished a major overhaul of the ATI shared library, and Kraken
is now able to take advantage of the speed that is offered by the GPU.
The work is not finished yet, as I want to further reduce latency by
cracking several bursts in parallel, this will better utilize the GPU
pipeline.

At any rate I have it to the point where disk IO is the bottleneck, and
to my dismay I discovered that 2 of my disks were much slower than the
others, and a job will only go as fast as the slowest disk allows for.
It could be the motherboard / controller, or the disks themselves. I
need to investigate.

Kraken will only load the ATI shared library if it is present, so
A5Ati.so will have to be placed manually in the working directory (same
as kraken) and work will automatically offloaded to the GPU. The final
keysearch is still done on the CPU, but this is limited to just 1 round
of the rainbow table, the GPU handles the rest.

As for distribution of the tables, I have gotten several requests for
having disks shipped, payed for with PayPal. I am not sure I really
trust this company to not freeze the accounts if someone complains about
the contents of ext3 preformatted 2TB disks. I should be able to
deliver these at cost for ~250 USD, this includes Norwegian sales tax
(25%) which I haven't found a way to defray. But perhaps someone on the
list could do better than this and set up shop in a country where
overall costs are lower? And ideally, quality packaging materials are
readily available :-)  (We have already suffer 1 unfortunate disk crash)

Frank




___
A51 mailing list
A51@lists.reflextor.com
http://lists.lists.reflextor.com/cgi-bin/mailman/listinfo/a51


[A51] Finding Kc with Kraken (dotting the i's)

2010-07-18 Thread Frank A. Stevenson
I made a very simple command line interface to Kraken, which has only 1
useful command (crack). Once fired up, you can then try to crack
multiple bursts without reloading the tables every time.

If you have some bursts that you want to crack such as:

3811417:
011100101011101011101101011010011101011101011011100100101100010111101000110101100101011100

3811424:
11100011000100011110001101010101110001101000100111010001111011101110000001

The first number is the frame COUNT used for mixing into A5/1 - it can
be derived from the frame number in the following way:

unsigned int fn2count(unsigned int fn) {
unsigned int t1 = fn/1326;
unsigned int t2 = fn % 26;
unsigned int t3 = fn % 51;
return (t111)|(t35)|t2;
}


The second burst can be cracked, and the command to and output from
Kraken looks like this:

Kraken crack
11100011000100011110001101010101110001101000100111010001111011101110000001

Cracking
11100011000100011110001101010101110001101000100111010001111011101110000001
Found a56290409b507d75 @ 37

Kraken 

This means a56290409b507d75 is the key that produces the output at
postion 37 after 100 clockings. These numbers can then be fed into my
latest tool: find_kc. This program will perform the backclocking,
reverses the frame count mix, and the key setup mixing (based on some
earlier programs that I wrote) - finally it can as an option take a
second frame count together with the burst data as input, and use that
to eliminate the wrong candidate Kcs from the backclocking. Example:

fr...@quant:~/gsm/tmto-svn/tinkering/A5Util$ ./find_kc a56290409b507d75
37 3811424 3811417
011100101011101011101101011010011101011101011011100100101100010111101000110101100101011100
 Found potential key (bits: 37)
db18a071e4d1f057 - db18a071e4d1f057
Framecount is 3811424
KC(0): 2e 61 10 5e 80 93 5e 1c  *** MATCHED ***
KC(1): bc 44 48 ed 03 04 02 53  mismatch
KC(2): d4 37 41 cf 3d 04 05 a5  mismatch
KC(3): da 74 09 51 60 07 7b c7  mismatch
KC(4): f3 f7 a8 3b f6 76 e6 5a  mismatch

The correct Kc is here: 2e 61 10 5e 80 93 5e 1c , and will produce both
cipherstreams correctly, as well as all other cipherstreams, and can
consequently be used to decrypt the entire call or SMS. (Byte order may
have to be changed, depending on your other tools)

How many more nails are needed for A5/1s coffin?  :-)

Frank


___
A51 mailing list
A51@lists.reflextor.com
http://lists.lists.reflextor.com/cgi-bin/mailman/listinfo/a51


Re: [A51] Announcing Berlin A5/1 rainbow table set.

2010-06-17 Thread Frank A. Stevenson
On Thu, 2010-06-17 at 11:38 +0200, Sylvain Munaut wrote:
 39-40 tables have been computed, in what we have dubbed the Berlin
A5/1
  rainbow table set.
 
 And how do we make use of them ?

The tables that currently reside in Norway can be used with the
demonstration program found here:

http://reflextor.com/trac/a51/browser/tinkering/A5Util/a5faster.cpp

This line of code is limited to using a single table at a time for the
moment, but new cracking code is being worked on.

The copy of the tables found in Berlin have been transmogrified into a
format that only works with yet-to-be released tools.

The Berlin A5/1 rainbow table set has a ~22% chance of recovering the
key from 114 bits of cipher. More tables will possibly be added in a
second installment.

Frank


___
A51 mailing list
A51@lists.reflextor.com
http://lists.lists.reflextor.com/cgi-bin/mailman/listinfo/a51


Re: [A51] old gsm 900 phone GA628

2010-06-17 Thread Frank A. Stevenson
I would suggest that you check out the OsmocomBB project, unfortunatly
interception of encrypted data us normally not possible on commodity
phone HW. The reason for this is that decryption happens before error
correction, and is therefore handled in the DSP at a level that normally
is inaccessible to SW.

f

On Thu, 2010-06-17 at 15:29 +0200, moongaboonga moongaboonga wrote:
 I have an old Z80 based phone and a lot of data about it,
 including electrical scheme, firmware, tools for disassemble flash
 code, memory maps, etc.
 It is very simple from hardware point of view, but there is a lot
 code to disassemble.
 The good thing is if someone can accomplish that he has all what needs
 to listen traffic,
 gather gsm algorithms and everything what phone must have on the plate
 and what can be used
 in any way, like for this project for example.
 
 But this is too much for one man to accomplish.
 Is there any chance that someone is interested?
 
 Or at least opinion is that a job that is good way to dive in details
 about real gsm,
  for the price of time it consume?
 
 Best regards!  
 ___
 A51 mailing list
 A51@lists.reflextor.com
 http://lists.lists.reflextor.com/cgi-bin/mailman/listinfo/a51

___
A51 mailing list
A51@lists.reflextor.com
http://lists.lists.reflextor.com/cgi-bin/mailman/listinfo/a51


[A51] Announcing Berlin A5/1 rainbow table set.

2010-06-16 Thread Frank A. Stevenson
Great progress has been made in computing ATI tables with 100 extra
clockings, we got some unexpected but very welcome help from university
students, and at most there were 8 GPUs on the job. 2x5850 + 2x5870 +
2x5970 - allowing us to compute almost 2 TB of tables in around 4 weeks
of time. Currently computation has been halted while we evaluate the
coverage, look at better compressions schemes, and focus on more
efficient ways to perform table lookups - a euphemism for cracking
A5/1 i.e. recovering keys from cipherstream. 

39-40 tables have been computed, in what we have dubbed the Berlin A5/1
rainbow table set. The Berlin reference I feel is significant, besides
the fact that the ideas to create these tables originated there, it has
also has been a collection point for assembling the tables. Moreover,
Berlin was also a focal point for tensions during the cold war, tensions
that in fact dictated the need for creating A5/1 in the first place. In
some ways A5/1 was intended as a virtual wall erected by the West
towards the East, to safeguard privacy of communication. Fortunately the
physical wall that separated East  West fell even before A5/1 was
fielded in the first GSM networks in 1991. Still A5/1 continued to serve
as a relatively effective protective measure for cellular
communications. But over time, as computers, FPGAs eventually GPUs grew
faster and faster, the once significant defenses of A5/1 started
crumbling, and eventually they offered little or no protection. In
response to the relentless advance of computing power, key-lengths were
increased, but in short order the available arsenal of remedies where
exhausted. Despite numerous claims that GSM encryption was at at the end
of its useful life, the GSMA kept insisting that the security offered by
A5/1 was adequate. Such denials and counterclaims, are obviously
counterproductive and even dangerous.

I therefore feel privileged to have taken part in this project, where
hackers from both former East  West have worked together on dismantling
the remains of A5/1 - and effectively declared it completely dead and
broken. Our hope is that this will bring about a shift towards proper
security in cellular communications, and not further compromised
solutions like A5/1 was from the outset. 

The tables that constitute the Berlin A5/1 rainbow table set. given by
their IDs are as follows: 

100 108 116 124 132 140 148 156 164 172
180 188 196 204 212 220 230 238 250 260
268 276 284 292 324 332 340 348 356 364
372 380 388 396 404 412 420 428 492 500

(284  492 are optional)

Due to their size, theses tables are not easily copied over the
Internet, so I have decided to resort to physical transfer in making
copies available to research etc. This I will do by announcing some of
my traveling to the list, and if there are interested receiving parties,
I can bring along tables on hard disk(s) for replication. After some
initial seeding, I believe there will be enough interest for these
tables to make them go viral. In addition, anyone who finds themselves
in Oslo, Norway are welcome to request a copy.

The first available location to make a copy will be:
* Bucharest, Romania, June 24th - July 5th 2010

Other arrangements can also be made, such as swapping preloaded disks
for cash (165EUR @cost) at Schiphol airport.

regards,
Frank A. Stevenson


___
A51 mailing list
A51@lists.reflextor.com
http://lists.lists.reflextor.com/cgi-bin/mailman/listinfo/a51


[A51] Are dummy bursts encrypted? (Airprobe)

2010-06-02 Thread Frank A. Stevenson
Another 'quick and easy' question perhaps.

The SDCCH has a fixed bit  burst rate, but because L3 messages are
mostly processed in synchronous manner, there will be idle time when no
messages need be sent. The BST is supposed to work at a constant power
level, therefore these TDMA frames are filled with dummy bursts.

But what happens when ciphering is enabled ? Will the dummy bursts still
be all 0 bits, or will they be encrypted ? So far I have only had access
to L3 logs that shows there is a time gap between individual messages,
but do we know what those gaps contain ? I would imagine that signal
processing on the receiving side will be easiest if the dummy bursts are
also encrypted, otherwise handling error bits in a a received dummy
burst becomes rather complicated.

F


___
A51 mailing list
A51@lists.reflextor.com
http://lists.lists.reflextor.com/cgi-bin/mailman/listinfo/a51


Re: [A51] Are dummy bursts encrypted? (Airprobe)

2010-06-02 Thread Frank A. Stevenson
On Wed, 2010-06-02 at 12:56 +0200, M vd S wrote:

 
 at night you actually see them flying around by the hundreds, in clear 
 text. This pattern, as well as other defined marker patterns, are 
 defined in this spec:
 http://www.3gpp.org/ftp/Specs/html-info/0502.htm
 

This document specifies that only Normal Burst can occur on an SDCCH,
except for the occasional Access Burst during handover. I am not sure
how much weight to give to this, since obviously some form of LAPDm
filling must be present.

The good news, is that page 55 onwards, shows how the frequency hopping
sequence is determined :-)

F



___
A51 mailing list
A51@lists.reflextor.com
http://lists.lists.reflextor.com/cgi-bin/mailman/listinfo/a51


[A51] Some updates

2010-04-10 Thread Frank A. Stevenson
I would like to give a short update on what goes on a little behind the
scene. Eagle eyed participants may have noticed some clues in the SVN
repository and source tree.

Firstly a lot of though and discussions have taken place around MvdS's
post to the list about applying extra clockings in the rainbow table
round function. And a consensus has more or less been reached, giving
merit to the ideas as posted to the list. And I would personally like to
thank MvdS for his theoretical contributions to the project.

It has however taken quite a bit of time to verify these findings, and
tune the table parameters for optimal performance. Karsten has been
toiling over this part, and is probably best suited to explain the
choices and candidates. Sascha and I have computed various test tables
to verify some of the assumptions. Currently I am computing my first
production table - and will then have to do some further verifications
on the results.

At this point I should underline, that the current generation of tables
are still good, and will be used for lookup. But the ATI tables are
slightly different, and will need some special attention at the sorting
and lookup stage. I have 10 completed tables, and it would probably be
best if the other ATI tables be sent my way after sorting to be
integrated in the lookup process.

However chances are that we will do a transition to tables with a new
format. I hope this will be done in an orderly manner, even though it is
hard to give a time line for how and when this will happen. I have maybe
even gone out on a limb here in announcing this somewhat prematurely.
But I think we have a duty to deliver timely information, as well as
managing expectations.

The current proposal entails computing fewer but larger tables, so for
the distributed project the workload will have to be sliced vertically
as well as horizontally, meaning a fair amount of work may have to be
done in that area. So please don't ask when the new client will be
ready.

The good news is, granted that the new table format delivers as
expected, we should be able to achieve the desired total coverage at an
accelerated rate. Even so, I would like to remind anyone eager to start
computing new tables, pushing forward the time of reaching desired
coverage by a few months, means very little if the intercept capability
required for using those tables is still lacking

regards,

  Frank




___
A51 mailing list
A51@lists.reflextor.com
http://lists.lists.reflextor.com/cgi-bin/mailman/listinfo/a51


Re: [A51] Extreme storage solutions

2010-02-22 Thread Frank A. Stevenson
On Mon, 2010-02-22 at 07:11 +0100, sascha wrote:

 The access time for 4 usb sticks is already 0.3ms. Their combined size is
 64GB. Take 32 of these groups and you get down to 0.01ms (if it scales 
 indeed).
 What i was able to test was that it scales up to 16 devices. I got 0.08ms
 access time with this configuration IIRC.
 Even if you need 2 or even 4 distinct USB host adaptors to connect 128
 devices, you still end up faster and cheaper than with SSD.
 Besides that, those PCIe SSDs are still more expensive than 2.5inch SSDs.
 And we certainly do not need hundreds of megabytes per second bandwidth.
 The write performance is also of no concern.
 
 As for the optimization of the USB bus, that comes for free. If you start
 4 CPU threads and manage to make each thread access the data blocks of
 a single device, then the access time improvements scale linearly with the
 number of devices.

Nice! It seems that USB mass storage should have us well covered for 5K
requests a seconds :-) I did some further experiments on a single stick,
and got access times down to 0.4ms when accessing the block device
directly. (Which means we do away with the filesystem altogether.)

I added the test code to svn:
tmto-svn/tinkering/various/speedy.cpp

typically you have to do sudo time ./speedy /dev/sdX and get timing
figures for 1 random reads. (Just under 4 seconds in my test)

F



___
A51 mailing list
A51@lists.reflextor.com
http://lists.lists.reflextor.com/cgi-bin/mailman/listinfo/a51


Re: [A51] lookup tool for tables

2010-02-08 Thread Frank A. Stevenson
On Sun, 2010-02-07 at 23:31 -0800, Sylv1 wrote:
 Hey,
  
 can anyone tell me more about the issues for the lookup stage with the
 new design?
 I would like to develop a lookup tool for the table on my local
 computer and would like to no more about what i should take care
 about. 
  
 Thanks for the help, 
 Cheers
  
 Sylvain
 
 --- On Sat, 2/6/10, sascha sascha.kriss...@web.de wrote:
 
 From: sascha sascha.kriss...@web.de
 Subject: Re: [A51] lookup tool for tables
 To: a51@lists.reflextor.com
 Date: Saturday, February 6, 2010, 11:18 AM
 
 
 the lookup tool is being worked on. with the latest changes in
 table layout
 there have come up some extra issues. and you wouldnt want a
 lookup tool
 that crashes regularly.
 
 
 ___
 A51 mailing list
 A51@lists.reflextor.com
 http://lists.lists.reflextor.com/cgi-bin/mailman/listinfo/a51


Lookup tools have already been written. I have committed an early
version to SVN : http://reflextor.com/trac/a51/browser/tinkering/A5Util
- this code is only for 'educational' use. Lookup on the CPU is much to
slow, so a GPU version will be needed. Please note that this tool uses a
different compressed format for the tables, and may not be suitable for
use with nvidia generated tables. (The unused bits in the starting
points are at different locations)

On the ATI front I have done some work with bypassing Brook+ runtime
layer, and taken direct control of the buffers / stream management by
using CAL.  I was hoping to see some speed increase by avoiding some
data copying,  but that wasn't a real problem. Profiling shows that the
ATI code leaves the GPU idle 3-7% of the time now. So there isn't much
more to gain.

What comes next is to write a better inter process communication system
for table generation. The python wrapper seems to have problems with
handling  8000 messages a second, and processes just dies on me.

After this i will have to write a better job management system, so the
GPUs can handle smaller work loads. Currently the ATI code needs a full
pipeline to operate. Ideally such a system would be able to shift jobs
between GPUs so that when there is little work to do, it will coalesce
these jobs to a single GPU.

At this point I should be ready for writing GPU lookup code. Well this
is my time line for GPU lookup.

F


___
A51 mailing list
A51@lists.reflextor.com
http://lists.lists.reflextor.com/cgi-bin/mailman/listinfo/a51


Re: [A51] 2.5x efficiency increase by removing illegal states

2010-01-10 Thread Frank A. Stevenson
On Sun, 2010-01-10 at 22:41 +0100, sascha wrote:
 On Sun, Jan 10, 2010 at 09:36:51PM +0100, Frank A. Stevenson wrote:
  
  I think it is should be compared with what you get from purely random
  states, it could be that our round function has a propensity for
  generating invalid states, and that would be a major flaw in our
  approach. (which advance value are you using?)
  
  F
  
  
 
 or mail the source code.

Or I could just do it myself, the back stepping source is available on
http://traxme.net/a5/ . M vd S also mailed me an even faster version of
this code. It is kind of a problem that there are lots of bits and
pieces of the kitchen sink laying around in sundry places, but not
comitted to SVN. Perhaps we should make a tinkering folder in the
repository for these things ?




___
A51 mailing list
A51@lists.reflextor.com
http://lists.lists.reflextor.com/cgi-bin/mailman/listinfo/a51


Re: [A51] 2.5x efficiency increase by removing illegal states

2010-01-10 Thread Frank A. Stevenson
On Mon, 2010-01-11 at 05:09 +0100, sascha wrote:

 due to the bitslicing none of those optimizations are applicable.
 I was expecting a smaller penalty also, due to the fact that i could
 avoid DRAM access. 

If there are real increases in efficiency to be had here, we could move
away from the bitslicing approach. Non bitsliced code runs at around 180
chains pr second on a single GPU, whereas bitsliced runs at 500. I did
notice however that non bitsliced code was running at around the same
speed when using tables and some clever multiplication trick to drive
the LFSRs.

However DX11 has introduced the NSA instruction (bit population count)
to the GPU instruction set. Using this to drive the LFSRs would make it
faster than before, and in my implementation at least, processing 2
chains pr thread instead of just a single would give another speed
increase. (Better GPU utilization, by filling slots that are unused due
to data dependencies) A5/1 could be clocked forward with ~32 operands,
making theoretic max speed about the same is the bitsliced speeds are in
practice. So if efficiency can be increased by removing invalid states,
non-bitsliced code may actually take the lead again.

f


___
A51 mailing list
A51@lists.reflextor.com
http://lists.lists.reflextor.com/cgi-bin/mailman/listinfo/a51


Re: [A51] 2.5x efficiency increase by removing illegal states

2010-01-09 Thread Frank A. Stevenson
On Sat, 2010-01-09 at 19:30 +0100, sascha wrote:

 Also note that the great majority of values in a table are never looked up
 but exist only as a link between the state we are interested in and the
 end value that is looked up in the data base. A false positive that does
 not pass the backclocking test is a rare case and does not influence the
 attack time very much. (is this true? how long does it take to to the
 backclocking?). Still we would need 2 times the storage if we use the old
 method.

I have gotten false positives, that can't bee clocked back during
testing of my table lookup code. Because of the very low current success
rate, it is hard to give empirical evidence of the probability of such
false misses, but I think we should be prepared for 50% false hit rate
of this nature. Meaning ~50% of all key states recovered from the tables
have no valid predecessor states at generation 100+.

This would be in line with overall frequency of valid states in the
table, and should not come as no surprise.

f

___
A51 mailing list
A51@lists.reflextor.com
http://lists.lists.reflextor.com/cgi-bin/mailman/listinfo/a51


Re: [A51] 2.5x efficiency increase by removing illegal states

2010-01-09 Thread Frank A. Stevenson
On Sat, 2010-01-09 at 23:39 +0100, sascha wrote:
 Another question is whether it is possible to convert an illegal state
 that produces the correct keystream to a legal state
 that produces the same keystream. When i generated some chains with
 a simple increment function generating the start values i got 92% chain
 merges in a 10M chains table which suggests that those states that
 produce the same keystream are only a few bit flips apart.

I am not sure if a simple bit flip will do it, but perhaps it is
possible by applying  single or few clockings of some direction to some
of the LFSRs, so you get a legal state, but yet produce the same
output.

Unfortunately when I found my false positives, I had thrown away the
original keys, and had no basis to compare the false positive with the
correct state. We should keep this in mind when we start testing the
lookup code at a bigger scale: to keep the correct keys for reference
and helping us understand better what the situation is with the false
positives.

f

___
A51 mailing list
A51@lists.reflextor.com
http://lists.lists.reflextor.com/cgi-bin/mailman/listinfo/a51


[A51] Table generation. Reminder + ATI script update

2010-01-01 Thread Frank A. Stevenson
Please take note that the reflextor.com web site is not up to date with
respect to the recommended table size. Both the fornt page, and the
table parameter script gives a size of 380M - this should be  270M .
Please ensure that tables are generated to the correct size.

Also the ATI gen_table.py script has been updated in svn. The updated
version fetches table advance parameters automatically, and begins
computing a new table automatically upon completion (i.e. size reaches 270M)

cheers,

 Frank

___
A51 mailing list
A51@lists.reflextor.com
http://lists.lists.reflextor.com/cgi-bin/mailman/listinfo/a51


[A51] Using multiple ATI cards

2009-12-31 Thread Frank A. Stevenson
I just supplemented my computer with a second 5850 card, and I am now
computing 2 tables at once. The instruction for doing this is rather
easy.

1) Make a new folder with gen_table.py and A5Brook.so
2) Download new table parameters to a51id.cgi
3) Write export BRT_ADAPTER=1 before running gen_table.py

Thus the table will be calculated on the specified GPU (1)

Currently my computer is outputting 1000 chains / second. What I really
need now is for the python script to automatically fetch new table
parameters when a table has been completed (easy to do really, I may
comitt this soon).

But when I am not doing A51 (after all it is -25 centigrade outside) I
also wrote a Mandelbulb renderer. Sample video here (in HD):

http://www.youtube.com/watch?v=oe2bzLH65_A

Then I had a brainwave: Making a combined Mandelbulb + A5/1 screensaver
for the unwashed masses (Windows) should easily get us all the tables we
need :-)


Happy new year!

Frank 

___
A51 mailing list
A51@lists.reflextor.com
http://lists.lists.reflextor.com/cgi-bin/mailman/listinfo/a51


Re: [A51] CPU implementation (more code)

2009-11-03 Thread Frank A. Stevenson
sascha wrote:
 Regarding CPU implementations:

 I have attached a 64 wide SIMD implementation (single instruction 64
 bits of data)
 that does 8 million a5/1 rounds per second on a single core
 of a 2ghz core2duo. I am not sure whether -O3 uses SSE automagically,
 so i assume not.
 I use 64 64bit wide memory locations (so effectively L1 cache) to
 store a51 registers vertically, and so i can generate a keystream bit
 for 64 independent a5/1 chains with a single set of instructions:

 (r118(regs) ^ r221(regs) ^ r322(regs));

 so r118(regs) points to a 64bit memory location that stores bit 18
 of R1 for 64 A5/1 engines.

 the performance bottleneck is the shifting of R1,R2,R3, since you have
 to touch each register.

   
I like this idea (it would be great to make a practical implementation) 
... without giving to much thought about it, it seems that the shifting 
can be avoided by unrolling the loop, and using the 64 registers as a 
ring-buffer. Hurray: more templates :-)

At the end of 64 iterations, you could then logical or the 15 
lsb-registers together, and inspect the result for 0 bits, 
(distinguising points) - The round functions can be held in another set 
of 64 bits registers/memory holding transposed round functions and the 
relevant round function(s) can be updated across the registers.

Frank


___
A51 mailing list
A51@lists.reflextor.com
http://lists.lists.reflextor.com/cgi-bin/mailman/listinfo/a51


Re: [A51] Fast sorting compact tables

2009-11-03 Thread Frank A. Stevenson
Frank A. Stevenson wrote:
 I wrote a table sorting function, ...

 Tarball is available for inspection at: http://traxme.net/a5/ (native 
 endianness is assumed when reading the tables)

 F
   
I have done more research on efficient table storage. My initial 
suggestion of using 1 million distict files would simply overwhelm a 
ext2 forametted flash device with inodes, so I had to reduce the number 
of files to a more manageble 4096. I have updated my tables sorting code 
accordingly (on the website for now, will add to svn later ) I have also 
added code that does the lookup into the compressed tables (not finished)

The interesting figures from a 8GB Kingston datatraveler with 5MB/sec 
write speed are:

1000 lookups in a smaller table (40million chains) can be performed in 
2.1 seconds from a cold device (newly inserted, with no locally cached 
inodes) - access speed is not an issue :-)

Time to sort a complete table would be limited to the write speed of the 
device (~15 minutes). It is also interesting to note that the suggested 
table size should fit in  4GB in compressed format. So an even 
multiples of tables can easily be stored on devices in commonly 
available sizes.

The big advantage with flash based storage is that it won't bog down 
your machine with a torrent of disk requests when doing th lookup work.

F


___
A51 mailing list
A51@lists.reflextor.com
http://lists.lists.reflextor.com/cgi-bin/mailman/listinfo/a51


[A51] ATI work in progress.

2009-10-21 Thread Frank A. Stevenson
Hi, I am new to this list and project. I have for some weeks now been 
intrigued by the computational powers of the latest ATI cards, and 
couldn't resist buying a card late last week (ATI Radeon HD 5850). As a 
first project, I decided to write the A5/1 ATI implementation for this 
project.

I have got the basics up and running, the round xor functions is 
missing, and some work need to be done on load balancing. I have 
problems with synchronizing CPU / GPU stalls Windows kernel.

Preliminary test runs at 200 chains / second on a single card. (subject 
to verification)


At this time I feel comfortable making a code drop, which can be found at:

http://traxme.net/a5/a5br.zip

This solution can be build with

MS Visual Studio 2008 express.
ATI Brook+ SDK v1.4

A test executable is included, but the workload is processed in 2-3 
second chunks, and slower cards may have difficulties with this. ( 
Adjust the loop counter in a5.br )

I don't feel particularly inclined to do the TMTO project integration, I 
am of the conviction that namespaces more than 2 levels deep will rot 
the human brain :-p

Frank



___
A51 mailing list
A51@lists.reflextor.com
http://lists.lists.reflextor.com/cgi-bin/mailman/listinfo/a51