[jira] Commented: (NUTCH-443) allow parsers to return multiple Parse object, this will speed up the rss parser

2007-06-16 Thread JIRA
[ https://issues.apache.org/jira/browse/NUTCH-443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12505448 ] Doğacan Güney commented on NUTCH-443: - Chris, did you get a chance to look at this? If you are busy, I can assign

[jira] Commented: (NUTCH-443) allow parsers to return multiple Parse object, this will speed up the rss parser

2007-06-16 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12505501 ] Chris A. Mattmann commented on NUTCH-443: - Doğacan, Whoops :) This one kind of fell off the radar screen.

[jira] Commented: (NUTCH-443) allow parsers to return multiple Parse object, this will speed up the rss parser

2007-05-14 Thread JIRA
[ https://issues.apache.org/jira/browse/NUTCH-443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12495696 ] Doğacan Güney commented on NUTCH-443: - I am not sure I follow you Andrzej. My patch already does a very similar

[jira] Commented: (NUTCH-443) allow parsers to return multiple Parse object, this will speed up the rss parser

2007-05-14 Thread Andrzej Bialecki (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12495797 ] Andrzej Bialecki commented on NUTCH-443: - Indeed... I forgot that we need crawl_parse to collect new

[jira] Commented: (NUTCH-443) allow parsers to return multiple Parse object, this will speed up the rss parser

2007-05-13 Thread JIRA
[ https://issues.apache.org/jira/browse/NUTCH-443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12495357 ] Doğacan Güney commented on NUTCH-443: - Well... That's embarrassing. It seems I forgot to include the necessary

[jira] Commented: (NUTCH-443) allow parsers to return multiple Parse object, this will speed up the rss parser

2007-02-28 Thread Andrzej Bialecki (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12476600 ] Andrzej Bialecki commented on NUTCH-443: - Almost there ... ParseResult seemed to tidy up this patch quite a

[jira] Commented: (NUTCH-443) allow parsers to return multiple Parse object, this will speed up the rss parser

2007-02-28 Thread JIRA
[ https://issues.apache.org/jira/browse/NUTCH-443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12476611 ] Doğacan Güney commented on NUTCH-443: - * you create the fake CrawlDatum-s in ParseOutputFormat, and then set

[jira] Commented: (NUTCH-443) allow parsers to return multiple Parse object, this will speed up the rss parser

2007-02-27 Thread Andrzej Bialecki (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12476297 ] Andrzej Bialecki commented on NUTCH-443: - Overall the idea of this improvement looks very useful, but I'm -1

[jira] Commented: (NUTCH-443) allow parsers to return multiple Parse object, this will speed up the rss parser

2007-02-27 Thread nutch.newbie (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12476361 ] nutch.newbie commented on NUTCH-443: Hi: We were really counting on this patch that it will make it to trunk as

[jira] Commented: (NUTCH-443) allow parsers to return multiple Parse object, this will speed up the rss parser

2007-02-15 Thread JIRA
[ https://issues.apache.org/jira/browse/NUTCH-443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12473383 ] Doğacan Güney commented on NUTCH-443: - Regarding the ObjectWritable: since in this case all data is composed of

[jira] Commented: (NUTCH-443) allow parsers to return multiple Parse object, this will speed up the rss parser

2007-02-14 Thread Andrzej Bialecki (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12473114 ] Andrzej Bialecki commented on NUTCH-443: - The contract for ParseUtil.getFirstParseEntry() seems unclear -

[jira] Commented: (NUTCH-443) allow parsers to return multiple Parse object, this will speed up the rss parser

2007-02-14 Thread JIRA
[ https://issues.apache.org/jira/browse/NUTCH-443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12473129 ] Doğacan Güney commented on NUTCH-443: - Andrzej: Thanks for taking the time to review this. The contract for

[jira] Commented: (NUTCH-443) allow parsers to return multiple Parse object, this will speed up the rss parser

2007-02-14 Thread Andrzej Bialecki (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12473141 ] Andrzej Bialecki commented on NUTCH-443: - Didn't know this, will change this too. (Why is Nutch not using

[jira] Commented: (NUTCH-443) allow parsers to return multiple Parse object, this will speed up the rss parser

2007-02-14 Thread JIRA
[ https://issues.apache.org/jira/browse/NUTCH-443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12473184 ] Doğacan Güney commented on NUTCH-443: - Andrzej: Why does fetcher need to synchronize? Why does the order fetcher

[jira] Commented: (NUTCH-443) allow parsers to return multiple Parse object, this will speed up the rss parser

2007-02-13 Thread nutch.newbie (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12472669 ] nutch.newbie commented on NUTCH-443: Chris: I have been testing NUTCH-444 and NUTCH-443 lately. Renaud and Dogacan

[jira] Commented: (NUTCH-443) allow parsers to return multiple Parse object, this will speed up the rss parser

2007-02-13 Thread Renaud Richardet (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12472733 ] Renaud Richardet commented on NUTCH-443: hi All, Glad to see that this patch is moving forward :-) I have

[jira] Commented: (NUTCH-443) allow parsers to return multiple Parse object, this will speed up the rss parser

2007-02-13 Thread Doug Cutting (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12472821 ] Doug Cutting commented on NUTCH-443: this patch in some places removes the log guards Most of the log guards

[jira] Commented: (NUTCH-443) allow parsers to return multiple Parse object, this will speed up the rss parser

2007-02-10 Thread nutch.newbie (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12471991 ] nutch.newbie commented on NUTCH-443: Dogacan: It works rather OK, but when I changed the parse-plugins.xml a bit

[jira] Commented: (NUTCH-443) allow parsers to return multiple Parse object, this will speed up the rss parser

2007-02-10 Thread nutch.newbie (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12471998 ] nutch.newbie commented on NUTCH-443: Hi. After swapping the parse-plugins.xml, i.e. in the following way (and

[jira] Commented: (NUTCH-443) allow parsers to return multiple Parse object, this will speed up the rss parser

2007-02-09 Thread JIRA
[ https://issues.apache.org/jira/browse/NUTCH-443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12471620 ] Dogacan Güney commented on NUTCH-443: - This is pretty much the merge of our work (except parse-rss, it kept

[jira] Commented: (NUTCH-443) allow parsers to return multiple Parse object, this will speed up the rss parser

2007-02-09 Thread nutch.newbie (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12471703 ] nutch.newbie commented on NUTCH-443: I tried the patch with about 100 RSS feeds. Some problems: 1. atom+xml

[jira] Commented: (NUTCH-443) allow parsers to return multiple Parse object, this will speed up the rss parser

2007-02-09 Thread nutch.newbie (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12471743 ] nutch.newbie commented on NUTCH-443: After doing some quick research, it seems that feedparser doesn't do Atom 1.0. The

[jira] Commented: (NUTCH-443) allow parsers to return multiple Parse object, this will speed up the rss parser

2007-02-09 Thread Gal Nitzan (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12471747 ] Gal Nitzan commented on NUTCH-443: -- Actually, I have tested Rome after feedparser failed with OutOfMemory. Rome has

[jira] Commented: (NUTCH-443) allow parsers to return multiple Parse object, this will speed up the rss parser

2007-02-09 Thread nutch.newbie (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12471754 ] nutch.newbie commented on NUTCH-443: Gal: Thanks for the feedback and the test you have done. If Nutch is going

[jira] Commented: (NUTCH-443) allow parsers to return multiple Parse object, this will speed up the rss parser

2007-02-09 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12471780 ] Chris A. Mattmann commented on NUTCH-443: - Nutch Newbie, What exactly do you mean when you mention Apache

[jira] Commented: (NUTCH-443) allow parsers to return multiple Parse object, this will speed up the rss parser

2007-02-09 Thread nutch.newbie (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12471806 ] nutch.newbie commented on NUTCH-443: Chris: Frankly my comments are regarding feedparser and I must say I am

[jira] Commented: (NUTCH-443) allow parsers to return multiple Parse object, this will speed up the rss parser

2007-02-09 Thread JIRA
[ https://issues.apache.org/jira/browse/NUTCH-443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12471857 ] Dogacan Güney commented on NUTCH-443: - nutch.newbie: I fail to see what the problem is. If feedparser doesn't

[jira] Commented: (NUTCH-443) allow parsers to return multiple Parse object, this will speed up the rss parser

2007-02-09 Thread Renaud Richardet (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12471878 ] Renaud Richardet commented on NUTCH-443: Nutch Newbie, Gal, Chris It's great that you discuss alternative

[jira] Commented: (NUTCH-443) allow parsers to return multiple Parse object, this will speed up the rss parser

2007-02-08 Thread JIRA
[ https://issues.apache.org/jira/browse/NUTCH-443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12471260 ] Dogacan Güney commented on NUTCH-443: - Ok, this is the second attempt (sorry that I am sending patches in a