Could you please create a JIRA issue and attach this patch there so it
won't get lost? It also helps keep the CHANGES file up to date, as you
can just copy-paste from there when you do a commit.
--
Sami Siren
Brian Whitman wrote:
> The parse-mp3 plugin seems to be saving a state of the previous parse's
> text content. For every new mp3 file parsed, it is putting the contents
> of all the previous text fields in the plain text field for that file.
>
> You can see this by fetching a set of mp3s in one segment, then viewing
> their plain text in the nutch webapp. The plaintext will include the
> contents of all files fetched in that round, which makes searching
> fruitless.
>
> I made a tiny band-aid change to MP3Parser.java and
> MetadataCollector.java against the nightly. It seems to fix the problem.
>
>
> --- MP3Parser.java 2006-12-10 09:43:26.000000000 -0500
> +++ MP3Parser.java.new 2006-12-10 16:37:03.000000000 -0500
> @@ -67,7 +67,7 @@
> fos.write(raw);
> fos.close();
> MP3File mp3 = new MP3File(tmp);
> -
> + metadataCollector.clearText();
> if (mp3.hasID3v2Tag()) {
> parse = getID3v2Parse(mp3, content.getMetadata());
> } else if (mp3.hasID3v1Tag()) {
>
> --- MetadataCollector.java 2006-12-10 09:43:26.000000000 -0500
> +++ MetadataCollector.java.new 2006-12-10 16:37:28.000000000 -0500
> @@ -42,6 +42,10 @@
> this.conf = conf;
> }
>
> + public void clearText() {
> + text = "";
> + }
> +
> public void notifyProperty(String name, String value) throws
> MalformedURLException {
> if (name.equals("TIT2-Text"))
> setTitle(value);
>
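
For the JIRA description, the failure mode reported above can be sketched
roughly as follows. This is a minimal illustration, not the actual plugin
code: Collector is a hypothetical stand-in for MetadataCollector, and
parseAll and addText are illustrative helpers. The only things taken from
the patch are the accumulated text field and the clearText() reset.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class StaleStateSketch {

    // Hypothetical stand-in for MetadataCollector: text accumulates across calls.
    static class Collector {
        private String text = "";

        void addText(String value) {
            text += value + " ";
        }

        // Same reset as in the patch: drop whatever earlier parses left behind.
        void clearText() {
            text = "";
        }

        String getText() {
            return text;
        }
    }

    // Simulates parsing several files with ONE shared collector instance.
    static List<String> parseAll(List<String> titles, boolean clearBeforeEach) {
        Collector collector = new Collector();
        List<String> results = new ArrayList<String>();
        for (String title : titles) {
            if (clearBeforeEach) {
                collector.clearText();
            }
            collector.addText(title);
            results.add(collector.getText());
        }
        return results;
    }

    public static void main(String[] args) {
        List<String> titles = Arrays.asList("Song A", "Song B");
        // Without the reset, the second entry also carries "Song A".
        System.out.println(parseAll(titles, false));
        // With the reset, each entry holds only its own text.
        System.out.println(parseAll(titles, true));
    }
}
```

Resetting at the start of each parse is the smallest change. Creating a
fresh collector per parse would avoid the shared state altogether, but
whether that fits the plugin's design is not something the patch shows.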
_______________________________________________
Nutch-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-developers