We needed a representative sample of classes recently, so we wrote this spider. It got us 13GB of jars, which is probably a pretty good (maybe too good) variety. We have tested some of our transformations on a small subset of these files by running the transformation, then calling Class.forName("foo", false /*<-important*/, getClass().getClassLoader()) on each class in the transformed jar to see whether all the classes still verify. There was enough weird code in there to keep us busy fixing bugs for a while, but I really don't think it's exhaustive. (Hint: some of the classes you get won't verify even before you mess with them.)

Feel free to do whatever you want with this code. It uses the Google API, which you can find out about at http://www.google.com/apis/. You'll need to replace "GET YOUR OWN KEY" below with your own key.

import java.util.*;
import java.io.*;
import java.net.*;
import com.google.soap.search.*;

public class JarFinder{
   public static void main(String args[]){
      GoogleSearch search = new GoogleSearch();
      search.setKey("GET YOUR OWN KEY");
      search.setQueryString("\"index of\" jar");
      GoogleSearchResultElement[] elements = null;

      PrintStream logout = null;
      try{
         logout = new PrintStream(new FileOutputStream("jarlog.out"));
      }catch(Exception ex){
         System.err.println("Error opening log output file");
         return;
      }

      HashSet gotten = new HashSet();
      int index=0;
      do{
         search.setStartResult(index);
         index+=10;
         try{
            elements = search.doSearch().getResultElements();
         }catch(Exception ex){
            System.err.println("Search failed");
            return;
         }

         System.err.println("Got "+elements.length+" results");

         for (int i=0;i<elements.length;i++){
            Vector lines = new Vector(1000);

            // Fetch the directory-listing page, one line at a time.
            try{
               BufferedReader input =
                  new BufferedReader
                  (new InputStreamReader
                   (new URL
                    (elements[i].getURL()).openStream()));
               String line=null;
               while((line=input.readLine())!=null)
                  lines.add(line);
               input.close();
            }catch(Exception ex){
               System.err.println("Error connecting or reading from "+elements[i].getURL());
            }


            for (int j=0;j<lines.size();j++){
               String line = (String)lines.get(j);
               for (java.util.Iterator jarurls=
                       getAllURLs(line, elements[i].getURL()).iterator();jarurls.hasNext();){
                  String jar = (String)jarurls.next();
                  // Save under the last path component; skip names already downloaded.
                  String savename = jar;
                  if (savename.indexOf('/')!=-1)
                     savename = savename.substring(savename.lastIndexOf('/')+1);
                  if (gotten.contains(savename)){
                     System.err.println("Skipping "+savename+" (already got one!)");
                     continue;
                  }
                  System.err.println("Saving "+jar+" as "+savename);

                  InputStream jarinput = null;
                  FileOutputStream fout = null;

                  // Copy the remote jar to a local file, 1MB at a time.
                  try{
                     jarinput = new URL(jar).openStream();
                     fout = new FileOutputStream(savename);
                     byte[] bytes=new byte[1048576];
                     int numread=0;
                     while((numread=jarinput.read(bytes))>0){
                        fout.write(bytes, 0, numread);
                     }
                     jarinput.close();
                     fout.close();
                  }catch(Exception ex){
                     System.err.println("Error preparing jarfile for download: "+jar);
                     continue;
                  }


                  gotten.add(savename);
                  logout.println(jar+" saved as "+savename);
               }
            }
         }

      // A page with fewer than 10 results is the last one.
      }while(elements.length==10);

      try{
         logout.close();
      }catch(Exception ex){}
   }

   private static HashSet getAllURLs(String line, String resulturl){
      HashSet results = new HashSet();

      // Scan for href="..." attributes and keep the targets that end in .jar.
      String lowerline=line.toLowerCase();
      int start=0;
      for (int result=lowerline.indexOf("href", start); result!=-1; result=lowerline.indexOf("href", start)){
         start=result+4;
         int qstart=line.indexOf("\"", result);
         if (qstart==-1)
            continue;
         int qend=line.indexOf("\"", qstart+1);
         if (qend==-1)
            continue;

         String jarname = line.substring(qstart+1, qend);
         if (jarname.toLowerCase().endsWith(".jar")){
            // Relative link: prepend the directory of the listing page.
            if (!(jarname.startsWith("http://") ||
                  jarname.startsWith("https://") ||
                  jarname.startsWith("ftp://"))){
               jarname = resulturl.substring(0, resulturl.lastIndexOf('/')+1)+jarname;
            }
            results.add(jarname);
            start=qend+1;
         }
      }
      return results;
   }
}
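
The verification pass described at the top of this message (load every class in a transformed jar with Class.forName and initialize=false) could be sketched roughly as follows. This is only an illustration, assuming Java 7 or later; the names JarVerifier, classNameOf, and findBrokenClasses are hypothetical and do not come from the original tool. Note also that whether loading without initialization triggers full bytecode verification depends on the JVM, since some defer verification until the class is linked.

```java
import java.io.File;
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

public class JarVerifier {
    /** Maps a jar entry name such as "a/b/C.class" to "a.b.C"; returns null for non-class entries. */
    static String classNameOf(String entryName) {
        if (!entryName.endsWith(".class")) {
            return null;
        }
        String withoutSuffix = entryName.substring(0, entryName.length() - ".class".length());
        return withoutSuffix.replace('/', '.');
    }

    /** Loads each class in the jar without initializing it and returns a description of every failure. */
    static List<String> findBrokenClasses(File jar) throws IOException {
        List<String> broken = new ArrayList<>();
        try (JarFile jarFile = new JarFile(jar);
             URLClassLoader loader = new URLClassLoader(
                 new URL[] { jar.toURI().toURL() }, JarVerifier.class.getClassLoader())) {
            Enumeration<JarEntry> entries = jarFile.entries();
            while (entries.hasMoreElements()) {
                String name = classNameOf(entries.nextElement().getName());
                if (name == null) {
                    continue; // not a class file
                }
                try {
                    // false = do not run static initializers, as in the original description
                    Class.forName(name, false, loader);
                } catch (ClassNotFoundException | LinkageError e) {
                    // LinkageError covers VerifyError, ClassFormatError and NoClassDefFoundError
                    broken.add(name + ": " + e);
                }
            }
        }
        return broken;
    }

    public static void main(String[] args) throws IOException {
        for (String arg : args) {
            for (String problem : findBrokenClasses(new File(arg))) {
                System.out.println(arg + ": " + problem);
            }
        }
    }
}
```

Run it as "java JarVerifier transformed.jar" on one or more jars. Classes whose dependencies live in other jars will show up as NoClassDefFoundError unless those jars are on the classpath too, so it makes sense to run the same check on the untransformed jar first and compare the two lists.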

Bil Lewis wrote:
I find that the instrumentation I do is generally fine,
but subject to odd little quirks that some compilers (or
other instrumentation tools) produce.

I only have the javac on Apple to work with and I *seem*
to have complete coverage of what it produces, but I
notice now and then that other compilers produce different
byte code and that I sometimes mess up on this code
because I've never seen it before.

So my question is: "Do we have a collection of class files
that give wide coverage of all the possible orderings for byte
codes?"

-Bil




--------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]


-- "I say to you that the VCR is to the American film producer and the American public as the Boston strangler is to the woman home alone." -Jack Valenti, President, Motion Picture Association of America, Inc., before The House Subcommittee on Courts, Civil Liberties, and The Administration of Justice, August, 1982, http://cryptome.org/hrcw-hear.htm

