The best form of self defence in a bar fight is to have left the bar 10 minutes before the fight starts. I learned that in Tai Chi and I think software patent law is a lot like a bar fight. ;-)
cheers, bob On Sep 18, 2014, at 8:11 AM, Raul Miller <[email protected]> wrote: > It's unlikely that most of those patents have merit. > > First, there's prior art (the LZ77 patent, for example - which is now > expired). And second, patents do not cover variations which would have > been obvious to someone with ordinary skill in the art before the > filing date of the patent. > > In the context of a decompression routine written in J, anything > described by the LZ77 patent must be obvious in the context of patents > filed later. And, since J is a variant on APL, any approaches that > were obvious to APL programmers back in the day would also be > considered obvious (for example), but also anything published in prior > centuries could also be considered evidence for obviousness. > > Changing the width of a window from 3 to 4? Obvious. Many grade school > kids are capable of doing that. Using an array instead of a linked > list? Obvious - that's what APL programmers do. Using an APL primitive > rather than some elaborate system of code? Obvious. > > Anyways, if someone tries to sue us on patent grounds we have a lot of > options for how to respond. We could, for example, start a kickstarter > project to document sufficient prior art to refute the patent. > Probably the first step, though, would be to read the claims of the > asserted patents and see what we can find for prior art for any of the > relevant claims. > > Generally speaking the claims will either be so narrow that they do > not apply, or so broad that they do apply but also apply to prior art. > > And, generally speaking, the claims will be asserted by lawyers with > no background in coding and no understanding of what the claims mean. > They'll typically have "experts" who back them up, who are far less > competent - at least in the context of what's obvious to a J > programmer - than the typical J programmer. > > Honestly, the biggest risk is probably falling asleep while reading > the legal "threats". > > Thanks, > > -- > Raul > > > > On Thu, Sep 18, 2014 at 6:09 AM, bill lam <[email protected]> wrote: >> Thanks. This is not urgent because zlib.so should usually >> available. >> >> Also as stated in rfc, many variation of LZ77 were patented so >> it suggested implementors to follow the guidelines (3 bytes at a >> time, hash etc) which is patent free. If the improvement in >> speed is signaficant then we may ignore its guideline. I guess >> there would not be any revenue in sueing Jsoftware for damage. >> >> Чт, 18 сен 2014, Raul Miller написал(а): >>> I can try taking a look at the code, one of these days, if you want me to. >>> >>> (Not right this second though.) >>> >>> Thanks, >>> >>> -- >>> Raul >>> >>> >>> On Thu, Sep 18, 2014 at 5:45 AM, bill lam <[email protected]> wrote: >>>> Thanks Raul, >>>> >>>> I followed you suggestion to use 2 generations of hash vector, >>>> Its speed is faster now although still far behind zlib.so. >>>> >>>> Ср, 17 сен 2014, Raul Miller написал(а): >>>>> I would ignore the stuff about "hashes" and "hash chains" and "singly >>>>> linked". Hashes are nice sometimes but other times they're just a >>>>> disaster. Sometimes that's better than the alternative, but their >>>>> marketing is sometimes significantly better than their actual >>>>> behavior. >>>>> >>>>> Anyways, what's really happening here is an exact match search and >>>>> some kind of constraint on the size of the table. >>>>> >>>>> I think the best way to approach that in J is to search in sorted >>>>> lists. The problem here is the incremental update. You don't want to >>>>> re-sort a large list every time you add something to the list. >>>>> >>>>> The way to approach that, I think, is to have several generations of >>>>> lists, with each larger than the last, and each slower to be update >>>>> than the last. >>>>> >>>>> So, for example, have on list which is unsorted. Use i. to look things >>>>> up in it, and merge it into the next list when it gets too long (maybe >>>>> when it has 30 items in it). >>>>> >>>>> The next list gets sorted every time it gets stuff added to it. Use I. >>>>> to look things up in it. Merge it into the next list when it gets too >>>>> long (maybe when it has 1000 items in it). >>>>> >>>>> Same for the next list, but this one, when it gets too long, you >>>>> arbitrarily discard items from it. >>>>> >>>>> Now the other thing that's happening, here, is that J doesn't really >>>>> do all that well with incremental update algorithms. You'll still be >>>>> slow if you do things "one group of three" at a time. But - >>>>> hypothetically at least - you don't have to. The ideal would be to >>>>> come up with an algorithm for a block of text which complies (at least >>>>> in terms of data structures) with the rfc and which behaves reasonably >>>>> close to the behavior of the rfc described behavior. >>>>> >>>>> But having a slow implementation that works is a really important >>>>> step. That (with some test cases - like the one you described here) >>>>> gives you a baseline to test against. >>>>> >>>>> Hopefully this helps? >>>>> >>>>> Thanks, >>>>> >>>>> -- >>>>> Raul >>>>> >>>>> >>>>> >>>>> >>>>> On Wed, Sep 17, 2014 at 5:23 AM, bill lam <[email protected]> wrote: >>>>>> The LZ77 variation used in deflate is at the bottom (copied from rfc >>>>>> 1951). I tried to translate it into J using a hash function which >>>>>> interprets XYZ as the lower 3 bytes of a 4 byte integer. My J >>>>>> implementation worked but efficient is rather poor. I think the >>>>>> inefficiency of my implementation came from the need to >>>>>> 1. build a vector of pre-computed hash of size equal to sliding >>>>>> window. (although the hash of all bytes had been pre-computed) >>>>>> 2. find the last index of the hash of current XYZ in the vector. >>>>>> (since the left argument changes, special code of the form m&i: does not >>>>>> apply) >>>>>> >>>>>> If data are truly random, the chance of matching would be very small, >>>>>> thus the time spent should roughly the sum of the above 1 & 2 for >>>>>> every byte, however it took a very long time even for data size of >>>>>> only about 500KB. Any idea for a better J implementation. >>>>>> >>>>>> NB. for deflate, minimum length for match is 3, max length is 285, >>>>>> maximum distance is 32K. >>>>>> >>>>>> eg. LZ77 compression of the input 500#'abc' >>>>>> is >>>>>> aa(285,1)(213,1)b(285,1)(214,1)c(285,1)(214,1) >>>>>> >>>>>> inside each bracket is (length,distance) pair applied to the history >>>>>> buffer. Untested but should be something like this. >>>>>> >>>>>> (text from rfc) >>>>>> The compressor uses a chained hash table to find duplicated strings, >>>>>> using a hash function that operates on 3-byte sequences. At any >>>>>> given point during compression, let XYZ be the next 3 input bytes to >>>>>> be examined (not necessarily all different, of course). First, the >>>>>> compressor examines the hash chain for XYZ. If the chain is empty, >>>>>> the compressor simply writes out X as a literal byte and advances one >>>>>> byte in the input. If the hash chain is not empty, indicating that >>>>>> the sequence XYZ (or, if we are unlucky, some other 3 bytes with the >>>>>> same hash function value) has occurred recently, the compressor >>>>>> compares all strings on the XYZ hash chain with the actual input data >>>>>> sequence starting at the current point, and selects the longest >>>>>> match. >>>>>> >>>>>> The compressor searches the hash chains starting with the most recent >>>>>> strings, to favor small distances and thus take advantage of the >>>>>> Huffman encoding. The hash chains are singly linked. There are no >>>>>> deletions from the hash chains; the algorithm simply discards matches >>>>>> that are too old. To avoid a worst-case situation, very long hash >>>>>> chains are arbitrarily truncated at a certain length, determined by a >>>>>> run-time parameter. >>>>>> ---------------------------------------------------------------------- >>>>>> For information about J forums see http://www.jsoftware.com/forums.htm >>>>> ---------------------------------------------------------------------- >>>>> For information about J forums see http://www.jsoftware.com/forums.htm >>>> >>>> -- >>>> regards, >>>> ==================================================== >>>> GPG key 1024D/4434BAB3 2008-08-24 >>>> gpg --keyserver subkeys.pgp.net --recv-keys 4434BAB3 >>>> gpg --keyserver subkeys.pgp.net --armor --export 4434BAB3 >>>> ---------------------------------------------------------------------- >>>> For information about J forums see http://www.jsoftware.com/forums.htm >>> ---------------------------------------------------------------------- >>> For information about J forums see http://www.jsoftware.com/forums.htm >> >> -- >> regards, >> ==================================================== >> GPG key 1024D/4434BAB3 2008-08-24 >> gpg --keyserver subkeys.pgp.net --recv-keys 4434BAB3 >> gpg --keyserver subkeys.pgp.net --armor --export 4434BAB3 >> ---------------------------------------------------------------------- >> For information about J forums see http://www.jsoftware.com/forums.htm > ---------------------------------------------------------------------- > For information about J forums see http://www.jsoftware.com/forums.htm ---------------------------------------------------------------------- For information about J forums see http://www.jsoftware.com/forums.htm
