I've been interested in using Nutch in a corporate environment where most content requires authentication. I've begun implementing the changes needed to add an HttpAuthentication set of interfaces and classes to support this (my initial plan is to key off realms). However, I have found an issue in the implementation of Content.java (and subclasses) that may keep this from being as clean as it could be. The metadata is stored in a Properties object, which implements Map and therefore allows only a single value for a given key. HTTP authentication allows a response to carry multiple WWW-Authenticate headers, so that the client can choose any one of the offered challenges when it builds its new, authenticated request.
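To illustrate the problem, here is a minimal sketch rather than the actual Nutch code. It stores two WWW-Authenticate headers in a plain java.util.Properties object and then in a small multi-valued holder. The class names MultiValueHeaderSketch and MultiValueHeaders, the helper methods, and the challenge strings are all made up for this example. The holder deliberately does not implement Map, which sidesteps the contract question, and it treats header names as case-sensitive for simplicity.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;

class MultiValueHeaderSketch {

    /** Hypothetical multi-valued header store (an assumption, not a Nutch class). */
    static class MultiValueHeaders {
        private final Map<String, List<String>> values = new LinkedHashMap<>();

        /** Appends a value for the name instead of replacing the previous one. */
        void add(String name, String value) {
            values.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
        }

        /** Returns every value recorded for the name, in arrival order. */
        List<String> getAll(String name) {
            List<String> found = values.get(name);
            return found == null
                ? Collections.<String>emptyList()
                : Collections.unmodifiableList(found);
        }
    }

    /** Stores name/value pairs in Properties and returns what survives for the name. */
    static String storeInProperties(String[][] headers, String name) {
        Properties props = new Properties();
        for (String[] header : headers) {
            // A repeated key replaces the earlier value.
            props.setProperty(header[0], header[1]);
        }
        return props.getProperty(name);
    }

    /** Stores the same pairs in the multi-valued holder and returns all values. */
    static List<String> storeInMultiValue(String[][] headers, String name) {
        MultiValueHeaders multi = new MultiValueHeaders();
        for (String[] header : headers) {
            multi.add(header[0], header[1]);
        }
        return multi.getAll(name);
    }

    public static void main(String[] args) {
        // Illustrative challenges only; a real server chooses its own schemes and realms.
        String[][] headers = {
            {"WWW-Authenticate", "Basic realm=\"intranet\""},
            {"WWW-Authenticate", "Digest realm=\"intranet\", nonce=\"abc123\""},
        };
        System.out.println("Properties keeps: "
            + storeInProperties(headers, "WWW-Authenticate"));
        System.out.println("Multi-valued keeps: "
            + storeInMultiValue(headers, "WWW-Authenticate"));
    }
}
```

With Properties, only the challenge stored last can be read back, so a client that supports only Basic would never see that Basic was offered. The multi-valued holder returns both challenges in the order they arrived.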
I have reviewed the HTTP protocol (RFC 1945), and it does allow multiple headers with the same name, which makes me think that there may be other headers (now or in the future) that would require multiple values. I have created a class called MultipleProperties that handles this; however, it breaks the contract of the Map interface.

Is anyone else interested in this type of use of Nutch? Should the implementation be left as is, even though there may be headers that are currently being missed?

My initial code has successfully used the MultiProperties class to collect multiple key/value pairs with the same key. I have also authenticated using Basic authentication at this point, and I plan to continue developing various authentication schemes. Before I submit anything, I'll wait to hear responses.

Matt

Nutch-developers mailing list
https://lists.sourceforge.net/lists/listinfo/nutch-developers
