Michael Spencer wrote:
> andrea wrote:
> >>>I was thinking to code the huffman algorithm and trying to compress
> >>>something with it, but I've got a problem.
> >>>How can I represent for example a char with only 3 bits??
>
> >>>I had a look to the compression modules but I can't understand them much...
> ...
> > I understand I can't do it easily in python, but maybe I could define a
> > new type in C and use it to do those dirty works, what do you think?
>
> Why do you need to create 'very small types'?
>
> You only need actual bit-twiddling when you do the encoding/de-coding right?
> If you create an encoding map 'codes' as a dict of strings of '1' and '0',
> encoding might look like (untested):
>
>      def encode(stream):
>          outchar = count = 0
>          for char in stream:
>              for bit in codes[char]:
>                  outchar = (outchar << 1) | (bit == "1")
>                  count += 1
>                  if count == 8:
>                      yield chr(outchar)
>                      outchar = count = 0
>          if count:
>              yield chr(outchar)

I wrote some Huffman compression code a few years ago, with the following
helper classes for writing and reading individual bits:

    BYTE_SIZE = 8  # number of bits per byte

    class BitWriter(object):
        # writes individual bits to an output stream
        def __init__(self, outputStream):
            self.__out = outputStream
            self.__bitCount = 0     # number of unwritten bits
            self.__currentByte = 0  # buffer for unwritten bits

        def write(self, bit):
            # shift the new bit in on the right, so bits are packed MSB first
            self.__currentByte = self.__currentByte << 1 | bit
            self.__bitCount += 1
            if self.__bitCount == BYTE_SIZE:
                # chr() yields a one-byte str on Python 2; on Python 3, write
                # bytes([self.__currentByte]) to a binary stream instead
                self.__out.write(chr(self.__currentByte))
                self.__bitCount = 0
                self.__currentByte = 0

        def flush(self):
            # pad the last partial byte with zero bits so that it gets written
            while self.__bitCount > 0:
                self.write(0)

    class BitReader(object):
        # reads individual bits from an input stream
        def __init__(self, inputStream):
            self.__in = inputStream
            self.__bits = []  # buffer to hold incoming bits

        def readBit(self):
            if len(self.__bits) == 0:
                # read the next byte (ord() raises TypeError at end of stream)
                b = ord(self.__in.read(1))
                # unpack the bits, most significant bit first
                self.__bits = [(b & (1 << i)) != 0
                               for i in range(BYTE_SIZE - 1, -1, -1)]
            return self.__bits.pop(0)
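
To make the bit packing concrete, here is a minimal round-trip sketch for Python 3. It is not the code I used back then: the CODES table is a made-up prefix-free code rather than the result of a real Huffman pass, and pack_bits and unpack_text are hypothetical helper names chosen for this example. It also assumes the number of symbols is stored or known separately, so the decoder can ignore the zero bits that pad the final byte.

```python
import io

BYTE_SIZE = 8  # number of bits per byte

# Hypothetical prefix-free code table, for illustration only
# (not produced by an actual Huffman tree construction).
CODES = {"a": "0", "b": "10", "c": "110", "d": "111"}


def pack_bits(text, codes=CODES):
    """Encode text with `codes` and pack the bits into bytes, MSB first."""
    out = io.BytesIO()
    current = count = 0
    for char in text:
        for bit in codes[char]:
            current = (current << 1) | (bit == "1")
            count += 1
            if count == BYTE_SIZE:
                out.write(bytes([current]))
                current = count = 0
    if count:
        # Pad the final partial byte with zero bits on the right.
        out.write(bytes([current << (BYTE_SIZE - count)]))
    return out.getvalue()


def unpack_text(data, length, codes=CODES):
    """Decode `length` symbols from packed bytes; padding bits are ignored."""
    decode = {bits: char for char, bits in codes.items()}
    result = []
    prefix = ""
    for byte in data:
        for i in range(BYTE_SIZE - 1, -1, -1):
            prefix += "1" if byte & (1 << i) else "0"
            if prefix in decode:
                result.append(decode[prefix])
                prefix = ""
                if len(result) == length:
                    return "".join(result)
    return "".join(result)


packed = pack_bits("abcd")
print(packed)                  # b'[\x80'  (the bytes 0x5b 0x80)
print(unpack_text(packed, 4))  # abcd
```

Here "abcd" encodes to the nine bits 010110111, so the output is two bytes: 0x5b, followed by the single leftover bit padded out to 0x80. Passing the symbol count to the decoder is just one way to deal with the padding; a dedicated end-of-stream symbol in the code table would work as well.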