Re: [Nutch-dev] Plugins initialized all the time!

2007-05-30 Thread Doğacan Güney
Hi, On 5/29/07, Nicolás Lichtmaier [EMAIL PROTECTED] wrote: Which job causes the problem? Perhaps, we can find out what keeps creating a conf object over and over. Also, I have tried what you have suggested (better caching for plugin repository) and it really seems to make a

[Nutch-dev] RE: Reply

2007-05-30 Thread 张豪兴
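Dear company leaders (manager/finance): Hello! Each month our company has a portion of computer-printed value-added tax invoices and ordinary goods sales tax invoices (national tax, local tax) that we can issue on your behalf at a discount or through cooperation. The commission rate is low, and a further discount can be negotiated depending on the volume. Our company solemnly promises that all invoices are genuine, and we hope to have the opportunity to work with your company. Payment after the invoices are verified. Integrity and confidentiality. If your company has any need, you are welcome to call us.   Contact phone: 13590116835 Contact: 张豪兴 E-MAIL [EMAIL PROTECTED] Address: International Culture Building, Shennan Middle Road, Shenzhen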

[Nutch-dev] Were you looking for me?

2007-05-30 Thread 代办税票
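To the person in charge: Hello! Our company is an A-rated enterprise that pays taxes normally, with a presence in large, medium, and small cities nationwide. All our dealings with clients and organizations follow national regulations, and we are willing to bear responsibility for any violation. To expand our market competitiveness, we offer clients convenient, flexible, and discounted handling of business taxes, and can help your company pay taxes at a discount. We can issue invoices on behalf of clients: I. Ordinary national tax invoices: 1. Commercial sales (can be checked online) 2. Unified goods sales 3. Industrial (enterprise) sales II. Ordinary local tax invoices: 1. Transport (computer-printed transport, freight forwarding, loading and unloading, combined transport, sea freight, etc.) 2. Other services (advertising, accommodation, conference, consulting fees, etc.) 3. Construction and installation; processing and repair 4. Customs verification forms for sale, at favorable prices and with convenient handover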

Re: [Nutch-dev] Plugins initialized all the time!

2007-05-30 Thread Andrzej Bialecki
Doğacan Güney wrote: My patch is just a draft to see if we can create a better caching mechanism. There are definitely some rough edges there :) One important piece of information: in future versions of Hadoop the method Configuration.setObject() is deprecated and will then be removed, so we have to

Re: [Nutch-dev] running nutch without http proxy

2007-05-30 Thread Marcin Okraszewski
Seems like this is the default. If anything, you may run into problems if you want to use a proxy. The default configuration is without a proxy. Cheers, Marcin On 5/29/07, prem kumar [EMAIL PROTECTED] wrote: Is it possible to run nutch without using an HTTP proxy to search the internet? If so, what are

Re: [Nutch-dev] Plugins initialized all the time!

2007-05-30 Thread Doğacan Güney
On 5/30/07, Andrzej Bialecki [EMAIL PROTECTED] wrote: Doğacan Güney wrote: My patch is just a draft to see if we can create a better caching mechanism. There are definitely some rough edges there :) One important piece of information: in future versions of Hadoop the method

[Nutch-dev] Committer

2007-05-30 Thread Chris Mattmann
Hi Folks, I'd just like to throw out my +1 for Doğacan Güney's committer status. I've been impressed by several of his contributions, and the guy just keeps them coming and coming. I'm not a member of the Lucene PMC, so I don't have official voting rights; however, I would like to express my

[Nutch-dev] [jira] Commented: (NUTCH-444) Possibly use a different library to parse RSS feed for improved performance and compatibility

2007-05-30 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12500133 ] Chris A. Mattmann commented on NUTCH-444: Hi Guys, Okay, here is the way that I currently see this issue,

[Nutch-dev] OutOfMemoryError - Why should the while(1) loop stop?

2007-05-30 Thread Manoharam Reddy
Time and again I get this error, and as a result the segment remains incomplete. This wastes one iteration of the for() loop in which I am doing generate, fetch and update. Can someone please tell me what measures I can take to avoid this error? And isn't it possible to make some code

Re: [Nutch-dev] OutOfMemoryError - Why should the while(1) loop stop?

2007-05-30 Thread Dennis Kubes
You can change the -Xms and -Xmx settings in the mapred.child.java.opts variable in your hadoop-site.xml file to allow more memory for your tasks. Are you trying to parse extremely big pages or files such as PDFs? If you are, you can also set maximum size limits for downloaded content using
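
For readers following the heap-size advice above, here is a minimal sketch of what such an entry in hadoop-site.xml might look like. The property name mapred.child.java.opts comes from the message itself; the 256m and 1024m values are illustrative assumptions, not figures from the thread, and should be sized to the memory actually available on each task node.

```xml
<?xml version="1.0"?>
<!-- hadoop-site.xml: site-specific overrides (sketch only) -->
<configuration>
  <property>
    <!-- JVM options passed to each spawned map/reduce child task -->
    <name>mapred.child.java.opts</name>
    <!-- Assumed example values: initial heap 256 MB, maximum heap 1024 MB -->
    <value>-Xms256m -Xmx1024m</value>
  </property>
</configuration>
```

Running tasks do not pick up the new value, so jobs would need to be resubmitted after the change.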

[Nutch-dev] [jira] Resolved: (NUTCH-61) Adaptive re-fetch interval. Detecting umodified content

2007-05-30 Thread Andrzej Bialecki (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-61?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrzej Bialecki resolved NUTCH-61. Resolution: Fixed Fix Version/s: 1.0.0 Applied with some modifications in rev.

[Nutch-dev] Personal data security

2007-05-30 Thread Poste Italiane S.p.A
Title: Poste Italiane Dear Poste.it customer, we ask you to