On Mon, Jan 5, 2009 at 7:00 AM, Vlad Cananau <vlad...@gmail.com> wrote:
> Hello
> I'm trying to make RSSParser do something similar to FeedParser (which
> doesn't work quite right) - that is, instead of indexing the whole contents

Why doesn't FeedParser work? Let's fix whatever is broken in it :D

> of the feed, I want it to show individual items, with their respective title
> and proper link to the article. I realize that I could index 1 depth
> more, but I'd like to index just the feed, not the articles that go with it
> (keep the index small and the crawl fast).
>
> For each item in each RSS channel (the code does not differ much for
> getParse() of RSSParser.java) I do something like
>
> Outlink[] outlinks = new Outlink[1];
> try {
>     outlinks[0] = new Outlink(whichLink, theRSSItem.getTitle());
> } catch (Exception e) {
>     continue;
> }
>
> parseResult.put(
>     whichLink,
>     new ParseText(theRSSItem.getTitle() + theRSSItem.getDescription()),
>     new ParseData(
>         ParseStatus.STATUS_SUCCESS,
>         theRSSItem.getTitle(),
>         outlinks,
>         new Metadata() //was content.getMetadata()
>     )
> );
>
> The problem is, however, that only one item from the whole RSS gets into the
> index, although in the log I can see them all (I've tried it with feeds
> from cnn and reuters). What happens? Why do they get overwritten in a
> seemingly random order? The item that makes it into the index is neither the
> first nor the last, but appears to be the same until new items appear in the
> feed.
>
> Thank you,
> Vlad

--
Doğacan Güney