On Mon, Jan 5, 2009 at 7:00 AM, Vlad Cananau <vlad...@gmail.com> wrote:
> Hello
> I'm trying to make RSSParser do something similar to FeedParser (which
> doesn't work quite right) - that is, instead of indexing the whole contents

Why doesn't FeedParser work? Let's fix whatever is broken in it :D

> of the feed, I want it to show individual items, each with its title and a
> proper link to the article. I realize that I could crawl one level deeper,
> but I'd like to index just the feed, not the articles that go with it
> (to keep the index small and the crawl fast).
>
> For each item in each RSS channel (the code does not differ much from
> getParse() in RSSParser.java) I do something like:
>
>  Outlink[] outlinks = new Outlink[1];
>  try {
>    // one outlink per item: its link, with the title as anchor text
>    outlinks[0] = new Outlink(whichLink, theRSSItem.getTitle());
>  } catch (Exception e) {
>    continue; // skip items whose outlink cannot be constructed
>  }
>
>  // add one parse entry per item, keyed by the item's link
>  parseResult.put(
>    whichLink,
>    new ParseText(theRSSItem.getTitle() + theRSSItem.getDescription()),
>    new ParseData(
>      ParseStatus.STATUS_SUCCESS,
>      theRSSItem.getTitle(),
>      outlinks,
>      new Metadata() // was content.getMetadata()
>    )
>  );
>
> The problem, however, is that only one item from the whole RSS feed gets into
> the index, although I can see them all in the log (I've tried it with feeds
> from CNN and Reuters). What is happening? Why do they get overwritten in a
> seemingly random order? The item that makes it into the index is neither the
> first nor the last, but appears to be the same until new items appear in the
> feed.
>
> Thank you,
> Vlad
>
>



-- 
Doğacan Güney
