[ http://issues.apache.org/jira/browse/NUTCH-169?page=all ]
Marko Bauhardt updated NUTCH-169:
---------------------------------
Attachment: NutchConf.370854.patch
> * a general comment: plugins now implement NutchConfigurable, which means
> that you had to add two new methods,
> looking exactly the same, to many classes. That's why the NutchConfigured
> class was created.
> I suggest replacing "implements NutchConfigurable" with "extends
> NutchConfigured" where appropriate.
In any case i have to implement two methods in the one hand I have to implement
set and getConf in the other hand I have to overwrite a constructor and a
setConf.
Since in many cases a constructor wouldn't helpful i decide to use the
interface.
In general it may would be make sense to have interface or an abstract class
that has just a configure method nothing more.
> * 1094: I think a better place to set the current config on a protocol
> instance is inside the ProtocolFactory.getProtocol()
> because now the factory itself is instantiated with an instance of nutchConf,
> so it keeps a reference to that config.
Done.
> * 1256: what is this constructor for? I think only the public constructor is
> used.
My mistake, fixed
> * 1311: please replace getExtentens() with getExtensions()
> * 1346, 1375: these classes should be static, I think
> * 1542, 1570: should be static
Done.
> * 1903,8476,10136: I wonder, shouldn't we cache these in nutchConf?
1903: Done.
8476,10136: we cache PluginRepository and not the extensions itself. From my
point of view in general to move caching or object recycling to the tools that
use the extensions / objects but not cache the object in it self.
> * 3154, 3650: what's the point of this line? it was already determined that
> there is nothing useful there... this line exists also in other similar
> facades.
In case the cache is empty I fill the cache inside the if condition in line
3150. So get the freshly cached values to assign them to the field.
> * 3627: typo, should be indexingFilters.
Done.
> * 3718, 6514: I think it would be better to create filters once, and keep
> them around.
> * 5020: is this an intentional change??
> * 6467: I think this change is an error.
> * 6651: I don't understand this comment...
> * 6777: should be static
> * 7045: shouldn't we store these filters too, like all other filters, in
> nutchConf?
> * 7132: I think we can cache CLIENT in nutchConf too.
Fixed.
> * 7782: either we should remove this, or use caching in nutchConf.
Done, I removed this.
> * 10638: local variable overshadows a superclass variable.
Done.
> * 1337 and following, inside CommonGrams.java: spurious whitespace, bad
> formatting
> * 1510-1539, 1748-1772, 1796, 1896-1907, 2556-2582, 2880, 3124-3160, 3207,
> 3211-3217, 4295,
> 4405,4657,4848,5343,5493,6566,6806-6822,6872,7295,7404,7441,7503,7540,7644,7680,7720,
>
> 7859,7896,7964,7977,8011,8049,8214,8226,8244,8280,8456,8471,9045,9162,9227,9261,9323,9342,
>
> 9380,9403,9580,9627,9677,9702,9779,9816,9820,9863,9871,9944,9961,10045,10130,10394,10415,
>
> 11079,11129: inconsistent indenting, should be 2 spaces. Some missing
> whitespace.
> * 1613, 1629, 1691, 2262, 2515, 2687, 2861, 3244, 3510, 3774, 3929, 4010,
> 4157, 4273,
> 4491,6578,6831,6840,6867,6900,6932,6956,6972,6981,7045,7065,7084,7140,7169,7357,7477,
>
> 7882,9171,9195,9204,9765,9910: whitespace
> * 1659,
> 2905,7404,7503,7540,7644,7683,7728,7890,7972,8015,8053,8151,8393,8471,8614,8741,8799,9005,
>
> 9049,9221,9342,9403,9589,9627,9702,9779,9820,9871,9961,10130,10410,10672,10765,
>
> 11129: non-javadoc generated comments should be removed
> * 2241, 2498, 2534-2538, 4244,6176,6798,7096,7256-7264: junk
i had done my best, to get all of this fixed.
I fixed also some other problems beside these you mentioned. Anyway, the test
suite, the crawl-process and the search runs local and in the ndfs successfully
for me.
Anyway it is a really big thing so please test it again.
Thanks, Marko
> remove static NutchConf
> -----------------------
>
> Key: NUTCH-169
> URL: http://issues.apache.org/jira/browse/NUTCH-169
> Project: Nutch
> Type: Improvement
> Reporter: Stefan Groschupf
> Priority: Critical
> Fix For: 0.8-dev
> Attachments: NutchConf.367837.patch, NutchConf.370854.patch,
> NutchConf.Fetcher.060111.patch, NutchConf.Http.060111.patch,
> NutchConf.RegexURLFilter.060111.patch, nutchConf.patch
>
> Removing the static NutchConf.get is required for a set of improvements and
> new features.
> + it allows a better integration of nutch in j2ee or other systems.
> + it allows the management of nutch from a web based gui (a kind of nutch
> appliance) which will improve the usability and also increase the user
> acceptance of nutch
> + it allows to change configuration properties until runtime
> + it allows to implement NutchConf as a abstract class or interface to
> provide other configuration value sources than xml files. (community request)
--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
http://www.atlassian.com/software/jira
-------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc. Do you grep through log files
for problems? Stop! Download the new AJAX search engine that makes
searching your log files as easy as surfing the web. DOWNLOAD SPLUNK!
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=121642
_______________________________________________
Nutch-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-developers