About ParseMetadata

2013-10-22 Thread Talat UYARER
Hi, when I tried to port ATLANTBH's filter-xpath plugins, I saw a ParseMetadata object. I think it is a leftover from 1.x. I searched a little in 2.x and found it in HTMLParser.java: it is created but never set anywhere. Can you explain whether we need it in 2.x, or can we clean up this code block? If this is u

Re: About ParseMetadata

2013-10-23 Thread Talat UYARER
PM, Talat UYARER mailto:talat.uya...@agmlab.com>> wrote: ORIGINAL_CHAR_ENCODING yes, in nutch 2.x , it not use parseMeta and contentMeta in Parse Object. one way is to clean this code block and another way is to add parseMeta in Parse Object. and another parser may will use this meta d

Re: About ParseMetadata

2013-10-27 Thread Talat UYARER
, 2013 at 6:24 PM, Talat UYARER mailto:talat.uya...@agmlab.com>> wrote: Hi Feng lu, I am not good at 1.x. Can you give some information when we need parseMeta in 1.x. is it stored in db ? If that will be necessary, I can develop. But I should understand what we nee

Why is createWebStore not generic ?

2013-11-02 Thread Talat UYARER
Hi all, I need to create a table from a plugin, but I can't because of this code: if (WebPage.class.equals(persistentClass)) { schema = conf.get("storage.schema.webpage", "webpage"); } else if (Host.class.equals(persistentClass)) { schema = conf.get("storage.schema.host
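One way the lookup quoted above could be made generic is sketched below. This is only an illustration under stated assumptions, not the actual Nutch or Gora code: `SchemaNames`, `resolve`, and the stand-in classes `WebPage`, `Host`, and `PluginTable` are hypothetical, and a plain `Map` replaces the Hadoop configuration object. The key is derived from the class name, so a plugin-defined class needs no extra `else if` branch.

```java
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper; not part of Nutch or Gora. */
public class SchemaNames {
    // Stand-ins for the real persistent classes (assumptions for this sketch).
    public static class WebPage {}
    public static class Host {}
    public static class PluginTable {}

    /**
     * Looks up "storage.schema.<lower-cased simple class name>" in the
     * configuration and falls back to the lower-cased class name itself.
     */
    public static String resolve(Map<String, String> conf, Class<?> persistentClass) {
        String name = persistentClass.getSimpleName().toLowerCase(Locale.ROOT);
        return conf.getOrDefault("storage.schema." + name, name);
    }

    public static void main(String[] args) {
        Map<String, String> conf = Map.of("storage.schema.host", "hosts_v2");
        System.out.println(resolve(conf, WebPage.class));     // default: class name
        System.out.println(resolve(conf, Host.class));        // overridden in conf
        System.out.println(resolve(conf, PluginTable.class)); // plugin-defined class
    }
}
```

With this convention the defaults for `WebPage` and `Host` stay the same as in the quoted snippet ("webpage" and "host"), while any other persistent class gets a schema name without touching the factory code.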

Re: Nutch Crawl a Specific List Of URLs (150K)

2013-12-28 Thread Talat Uyarer
FO crawl.Injector - Injector: Merging injected > urls into crawl db. > > I don't know how 140K URLs ended up being 872 in the end... > > /usr/bin > > -- > AWS ubuntu instance > Nutch 1.7 > java version "1.6.0_27" > OpenJDK Runtime En

Re: [REQUEST] Integrate Wicket and Nutch for Google Summer of Code 2014

2014-02-06 Thread Talat Uyarer
Hi folks, may I ask why we prefer Wicket for the admin panel? Why do we evaluate other options? Talat On Feb 6, 2014 at 16:57, "Martin Grigorov" wrote: > Hi Lewis, > > I'm glad you contacted us! > I see the ticket has been opened for few years now. This is a shame! We > should have coordina

Bandwidth Limit

2014-03-12 Thread Talat Uyarer
Hi all, I wonder whether we can limit bandwidth usage. We can control the number of connections with fetch threads * reduce count, but how do we control the download rate? Thanks -- Talat UYARER Website: http://talat.uyarer.com Twitter: http://twitter.com/talatuyarer Linkedin: http://tr.linkedin.com/pub
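A common generic technique for capping a download rate is a token bucket. The sketch below is an assumption-laden illustration, not an existing Nutch feature or class: `TokenBucket` and its methods are hypothetical names, and the caller passes the clock value explicitly so the behaviour is deterministic. A fetcher thread would ask the bucket how many bytes it may read before each read call.

```java
/** Hypothetical token-bucket limiter; not an existing Nutch class. */
public class TokenBucket {
    private final long capacityBytes;        // maximum burst size
    private final long refillBytesPerSecond; // sustained rate limit
    private double availableBytes;           // tokens currently in the bucket
    private long lastRefillNanos;            // time of the last refill

    public TokenBucket(long capacityBytes, long refillBytesPerSecond, long nowNanos) {
        this.capacityBytes = capacityBytes;
        this.refillBytesPerSecond = refillBytesPerSecond;
        this.availableBytes = capacityBytes; // start with a full bucket
        this.lastRefillNanos = nowNanos;
    }

    /** Returns how many of the requested bytes may be read right now. */
    public synchronized long tryAcquire(long requestedBytes, long nowNanos) {
        refill(nowNanos);
        long granted = Math.min(requestedBytes, (long) availableBytes);
        availableBytes -= granted;
        return granted;
    }

    /** Adds tokens for the elapsed time, never exceeding the capacity. */
    private void refill(long nowNanos) {
        double elapsedSeconds = (nowNanos - lastRefillNanos) / 1e9;
        availableBytes = Math.min(capacityBytes,
                availableBytes + elapsedSeconds * refillBytesPerSecond);
        lastRefillNanos = nowNanos;
    }
}
```

In real use the caller would pass `System.nanoTime()` and sleep briefly whenever fewer bytes are granted than requested. Sharing one bucket across all fetch threads in a task would bound that task's total rate; a cluster-wide limit would need extra coordination that this sketch does not cover.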

Re: [WELCOME] Nutch PMC Welcomes Talat Uyarer to PMC and Committer

2014-04-01 Thread Talat Uyarer
team. > @Talat, > Please feel free to say a bit about yourself and your current interest in > using and developing Nutch. > Congratulations on your new role within the Nutch community. > Best > Lewis > (on behalf of the Nutch PMC) > > -- > Lewis -- Talat

Re: Add Field to crawled content for indexing

2014-04-02 Thread Talat Uyarer
In addition to Sebastian's mail: 2.x has an index-metadata filter. If you want to send any field that is in the metadata to the index, you just write its name in the configuration. I recommend you look at index-metadata. Talat On Apr 2, 2014 at 23:30, "Sebastian Nagel" wrote: > Hi Yann, > > > In Parse type, w

Re: [ANNOUNCEMENT] Apache Gora 0.4 Release

2014-04-23 Thread Talat Uyarer
I am happy to see this email. I am starting to work on NUTCH-1714. Thanks Lewis. On Apr 23, 2014 at 15:53, "Lewis John Mcgibbney" < lewis.mcgibb...@gmail.com> wrote: > Good Afternoon Everyone, > > The Apache Gora team are very proud to announce the immediate release of > Gora 0.4 which is a major re

Re: [DISCUSS] Roadmap for 2.3 Release

2014-05-01 Thread Talat Uyarer
elatively soon. >> >> On Tue, Apr 29, 2014 at 1:09 PM, wrote: >> >>> >>> I think we can also add https://issues.apache.org/jira/browse/NUTCH-1674. >>> This issue was waiting the stable release of gora-0.4. >>> >>> And IMHO, we can a

Re: [DISCUSS] Roadmap for 2.3 Release

2014-05-01 Thread Talat Uyarer
you agree with > me as I was suggesting we stick to the ones already listed minus 1741. > > Thanks > > Julien > > > > On 1 May 2014 08:40, Talat Uyarer wrote: > >> I agree with you Julien. Today Lewis changed the fix version of some issues from >> 2.3 to 2.4.

Re: [DISCUSS] Roadmap for 2.3 Release

2014-05-01 Thread Talat Uyarer
the next release. [If you want your own >>> repository then open an account on GitHub (or somewhere else) and clone the >>> 2.x branch to add the patches of your choice]. >>> >>> Lewis suggested a roadmap for the next release and the changes he made >>

Giraph Integration

2014-05-02 Thread Talat Uyarer
Hi all, a long time ago, we talked with Julien and Lewis about the major needs for 2.x on the mailing list. As far as I know, Giraph uses only map slots as workers. At present, our scoring plugin architecture doesn't permit this. IMHO we should create a pluggable RankingJob, like IndexingJob. The pluggable ar

About RankingJob for Giraph

2014-05-02 Thread Talat Uyarer
Hi all, a long time ago, we talked with Julien and Lewis about the major needs for 2.x on the mailing list. I know that Giraph uses only map slots as workers. At present, our scoring plugin architecture doesn't permit this. Giraph and Opic have different work types. IMHO we should create a pluggable Ran

Better Parser Plugin

2014-05-02 Thread Talat Uyarer
similar to Google's parser. WDYT? [1] http://jsoup.org/ [2] https://github.com/google/gumbo-parser -- Talat UYARER Website: http://talat.uyarer.com Twitter: http://twitter.com/talatuyarer Linkedin: http://tr.linkedin.com/pub/talat-uyarer/10/142/304

Re: About RankingJob for Giraph

2014-05-05 Thread Talat Uyarer
> But, to avoid any misunderstanding: that's not against writing > a "RankingJob for Giraph". > > Sebastian > > > On 05/03/2014 12:10 AM, Talat Uyarer wrote: >> Hi all, >> >> A long time ago, we talked with Julien and Lewis about major nee

Re: Giraph Integration

2014-05-05 Thread Talat Uyarer
t Giraph algorithms similar to this solution. If this >> makes sense to everybody, I can implement it after the 2.3 release. > > Last year there was work done to address this. > You can see it here > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=31820383 > ht

Re: Better Parser Plugin

2014-05-05 Thread Talat Uyarer
nefits? If we have a clear cut argument then let's go for > it. If not then maybe your time would be better invested elsewhere. It's up > to you I suppose :) > -- Talat UYARER Website: http://talat.uyarer.com Twitter: http://twitter.com/talatuyarer Linkedin: http://tr.linkedin.com/pub/talat-uyarer/10/142/304

Re: Better Parser Plugin

2014-05-05 Thread Talat Uyarer
rser. >> > So what are the benefits? If we have a clear cut argument then let's go for > it. If not then maybe your time would be better invested elsewhere. It's up > to you I suppose :) > -- Talat UYARER Website: http://talat.uyarer.com Twitter: http://twitter.com/talatuyarer Linkedin: http://tr.linkedin.com/pub/talat-uyarer/10/142/304

Re: [VOTE] Remove pom.xml from source

2014-07-15 Thread Talat Uyarer
; Open Source Solutions for Text Engineering > > http://digitalpebble.blogspot.com/ > http://www.digitalpebble.com > http://twitter.com/digitalpebble > -- Talat UYARER Website: http://talat.uyarer.com Twitter: http://twitter.com/talatuyarer Linkedin: http://tr.linkedin.com/pub/talat-uyarer/10/142/304

Re: [ANNOUNCE] GSoC Create a Wicket-based Web Application for Nutch Project SUCCESSFUL

2014-08-31 Thread Talat Uyarer
> http://people.apache.org/~lewismc || @hectorMcSpector || > http://www.linkedin.com/in/lmcgibbney > > Apache Gora V.P || Apache Nutch PMC || Apache Any23 V.P || Apache OODT PMC || > Apache Open Climate Workbench PMC || Apache Tika PMC || Apache TAC -- Talat UYARER Website: http://talat.uyarer.com Twitter: http://twitter.com/talatuyarer Linkedin: http://tr.linkedin.com/pub/talat-uyarer/10/142/304

Re: Jump to 3.X WAS [RELEASE] Apache Nutch 1.9

2014-09-01 Thread Talat Uyarer
>> >> Based on the discussion from which this new thread stems I would totally >> be behind this. It breathes new life into trunk. Which is a bonnie feather >> in the Nutch bonnet. Here is my +1 on that one. >> >> >>> >>> Nutch2 remains Nutch and ca

Re: Generic xsl parser plugin

2014-09-25 Thread Talat Uyarer
Hi all, I made some changes to Emir's plugin to make it compatible with 2.x. It is useful; if you need it, I can share my fork. Talat On Sep 26, 2014 6:47 AM, "Nima Falaki" wrote: > Hi: > > Yes, it would be very interesting. Let me know what Emir says > > Nima > > On Thu, Sep 25, 2014 at 12:43 PM, Albinscod

Re: Generic xsl parser plugin

2014-09-25 Thread Talat Uyarer
One last thing: I wrote a how-to-use document for it. :) On Sep 26, 2014 6:52 AM, "Talat Uyarer" wrote: > Hi all, > > I made some changes to Emir's plugin to make it compatible with 2.x. It is useful; > if you need it, I can share my fork. > > Talat > On Sep 26, 2014 6:47 AM, "

Re: Moving Away from MoinMoin

2014-09-25 Thread Talat Uyarer
+1 cwiki is definitely nice. I can help you with this migration. On Sep 25, 2014 6:30 PM, "Lewis John Mcgibbney" wrote: > Hi Folks, > MoinMoin is driving me literally mad. > I wait for half an hour every time I edit the documentation. > It is primitive and no-one can comment on documentation. > W

Re: Problem in trunk in regards to the protocol-http/src/test plugin

2014-10-01 Thread Talat Uyarer
nd solves the problem). > > Does anyone else get this problem? Or is it just me. My build.xml should > solve this problem. Can a committer check this in? > -- > > > > Nima Falaki > Software Engineer > nfal...@popsugar.com > > -- Talat UYARER Website: http://ta

Re: Problem in trunk in regards to the protocol-http/src/test plugin

2014-10-01 Thread Talat Uyarer
Thanks for attention. We sent only to 2.x branch. We should port to trunk. Do you want to do this ? 2014-10-01 15:55 GMT+03:00 Nima Falaki : > I am on trunk. > On Oct 1, 2014 5:46 AM, "Talat Uyarer" wrote: > >> Hi Nima, >> >> Which version of Nutch

Re: Problem in trunk in regards to the protocol-http/src/test plugin

2014-10-01 Thread Talat Uyarer
elf? > > On Wed, Oct 1, 2014 at 5:58 AM, Talat Uyarer wrote: > >> Thanks for attention. We sent only to 2.x branch. We should port to >> trunk. Do you want to do this ? >> >> 2014-10-01 15:55 GMT+03:00 Nima Falaki : >> >>> I am on trunk. >>> On

Re: Build failed in Jenkins: Nutch-nutchgora #1264

2014-12-14 Thread Talat Uyarer
] [SUCCESSFUL ] commons-io#commons-io;2.4!commons-io.jar >> (104ms) >> [ivy:resolve] downloading >> http://repo1.maven.org/maven2/org/slf4j/slf4j-api/1.7.7/slf4j-api-1.7.7.jar >> ... >> [ivy:resolve] .. (28kB) >> [ivy:resolve] .. (0kB) >> [ivy:resolve] [SUCCESSFUL ] org.slf4j#slf4j-api;1.7.7!slf4j-api.jar >> (165ms) >> [ivy:resolve] downloading >> http://repo1.maven.org/maven2/org/mortbay/jetty/jetty-sslengine/6.1.26/jetty-sslengine-6.1.26.jar >> ... >> [ivy:resolve] .. (18kB) >> [ivy:resolve] .. (0kB) >> [ivy:resolve] [SUCCESSFUL ] >> org.mortbay.jetty#jetty-sslengine;6.1.26!jetty-sslengine.jar (87ms) >> [ivy:resolve] downloading >> http://repo1.maven.org/maven2/org/mortbay/jetty/jetty-util5/6.1.26/jetty-util5-6.1.26.jar >> ... >> [ivy:resolve] .. (22kB) >> [ivy:resolve] .. (0kB) >> [ivy:resolve] [SUCCESSFUL ] >> org.mortbay.jetty#jetty-util5;6.1.26!jetty-util5.jar (88ms) >> [ivy:resolve] >> [ivy:resolve] :: problems summary :: >> [ivy:resolve] WARNINGS >> [ivy:resolve] :: >> [ivy:resolve] :: UNRESOLVED DEPENDENCIES :: >> [ivy:resolve] :: >> [ivy:resolve] :: >> org.restlet.jse#org.restlet.lib.org.restlet.lib.org.json;2.0: >> java.text.ParseException: inconsistent module descriptor file found in >> 'http://maven.restlet.org/org/restlet/jse/org.restlet.lib.org.restlet.lib.org.json/2.0/org.restlet.lib.org.restlet.lib.org.json-2.0.pom': >> bad module name: expected='org.restlet.lib.org.restlet.lib.org.json' >> found='org.restlet.lib.org.json'; >> [ivy:resolve] :: >> [ivy:resolve] ERRORS >> [ivy:resolve] restlet: bad module name found in >> http://maven.restlet.org/org/restlet/jse/org.restlet.lib.org.restlet.lib.org.json/2.0/org.restlet.lib.org.restlet.lib.org.json-2.0.pom: >> expected='org.restlet.lib.org.restlet.lib.org.json >> found='org.restlet.lib.org.json' >> [ivy:resolve] >> [ivy:resolve] :: USE VERBOSE OR DEBUG MESSAGE LEVEL FOR MORE DETAILS >> >> BUILD FAILED >> <https://builds.apache.org/job/Nutch-nutchgora/ws/2.x/build.xml>:467: >> impossible to resolve 
dependencies: >> resolve failed - see output for details >> >> Total time: 1 minute 24 seconds >> Build step 'Invoke Ant' marked build as failure >> Publishing Javadoc >> > -- Talat UYARER Website: http://talat.uyarer.com Twitter: http://twitter.com/talatuyarer Linkedin: http://tr.linkedin.com/pub/talat-uyarer/10/142/304

Re: elasticindex error job failed: name=elastic-index

2014-12-27 Thread Talat Uyarer
Hi Husein, Can you share your hadoop.log file On Dec 26, 2014 7:24 PM, "Hesham Hussein" wrote: > Hello, > > I get this error > > $ bin/nutch elasticindex elasticsearch -all > Exception in thread "main" java.lang.RuntimeException: job failed: > name=elastic-index [elasticsearch], jobid=null >

Re: elasticindex error job failed: name=elastic-index

2014-12-29 Thread Talat Uyarer
Hi, is your Elasticsearch version the same as the version Nutch depends on? 2014-12-29 19:43 GMT+02:00 Hesham Hussein : > Sure. Here is the file. > > Thanks > > BTW: I'm using Mysql for the DB. -- Talat UYARER Website: http://talat.uyarer.com Twitter: http://twitter.com/

Re: [VOTE] Release Apache Nutch 2.3

2015-01-15 Thread Talat Uyarer
Hi Lewis, I have finished my review. - AdaptiveFetchSchedule does not work: the default setting is a float, but it needs an integer. - It does not compile with Java 1.6 in my environment. If those are not a problem for publishing the next release, it looks OK. Talat On Jan 16, 2015 8:40 AM, "Lewis John Mcgibbney" wrote:

Re: [VOTE] Release Apache Nutch 2.3

2015-01-22 Thread Talat Uyarer
t we only support 1.7 now. This is the case with Nutch > 1.10-SNAPSHOT anyways. > >> >> >> If those are not problem for next release publishing. It looks OK > > > So this is a VOTE thread. Can you please provide a VOTE so we can determine > whether we can release this

Re: GSoC 2015

2015-02-05 Thread Talat Uyarer
e.com > <http://www.digitalpebble.com> > http://twitter.com/digitalpebble <http://twitter.com/digitalpebble> > > -- Talat UYARER Website: http://talat.uyarer.com Twitter: http://twitter.com/talatuyarer Linkedin: http://tr.linkedin.com/pub/talat-uyarer/10/142/304

Re: [ANNOUNCE] New Nutch committer and PMC - Jorge Luis Betancourt Gonzalez

2015-02-19 Thread Talat Uyarer
Welcome on board! On Feb 20, 2015 12:18 AM, "Mattmann, Chris A (3980)" < chris.a.mattm...@jpl.nasa.gov> wrote: > Welcome to the party, Jorge! > > Cheers, > Chris > > ++ > Chris Mattmann, Ph.D. > Chief Architect > Instrument Software a

Google Summer of Code 2015 Mentor Registration

2015-03-11 Thread Talat Uyarer
Nutch PMC, Please acknowledge my request to become a mentor for Google Summer of Code 2015 projects for Apache Nutch. My Melange username is talat. -- Talat UYARER Website: http://talat.uyarer.com Twitter: http://twitter.com/talatuyarer Linkedin: http://tr.linkedin.com/pub/talat-uyarer/10

Re: Google Summer of Code 2015 Mentor Registration

2015-03-11 Thread Talat Uyarer
p://www.google-melange.com/gsoc/org2/google/gsoc2015/apache > [3] http://www.google-melange.com/gsoc/homepage/google/gsoc2015 > [4] https://svn.apache.org/repos/private/committers/GsocLinkId.txt > [5] > http://www.google-melange.com/gsoc/connection/start/user/google/gsoc2015/apache > > > > -- > Lewis -- Talat UYARER Website: http://talat.uyarer.com Twitter: http://twitter.com/talatuyarer Linkedin: http://tr.linkedin.com/pub/talat-uyarer/10/142/304

Re: Unsubscribe

2015-05-04 Thread Talat Uyarer
For unsubscribing, Please send an empty email to dev-unsubscr...@nutch.apache.org Detailed information on http://apache.org/foundation/mailinglists.html On May 5, 2015 5:10 AM, "Prerna Totla" wrote: > Regards, > Prerna Totla. >

Re: Unsubscribe

2015-05-04 Thread Talat Uyarer
For unsubscribing, Please send an empty email to dev-unsubscr...@nutch.apache.org Detailed information on http://apache.org/foundation/mailinglists.html On May 4, 2015 10:00 PM, "Avani Gupta" wrote: > > Thanks and Regards, > > Avani Gupta > Master's Student > Department of Computer Science > Uni

Please read this if you want to unsubscribe

2015-05-19 Thread Talat Uyarer
For unsubscribing, Please send an empty email to dev-unsubscr...@nutch.apache.org Detailed information on http://apache.org/foundation/mailinglists.html

Re: Nutch-1741 in GSOC 2015

2015-05-25 Thread Talat Uyarer
>>> As you are in the community bonding period right now, please feel free >>>>> to provide your wiki username to me and I will grant you access to the >>>>> wiki. >>>>> Please also feel free to pick up some lingering issues for Nutch 2.3.1 >

Nutch and JS/Css rendering

2015-07-06 Thread Talat Uyarer
anding-web-pages-better.html -- Talat UYARER

Re: Nutch and JS/Css rendering

2015-07-06 Thread Talat Uyarer
e with me, I will >> be glad. >> >> [1] >> http://googlewebmastercentral.blogspot.com.tr/2014/05/understanding-web-pages-better.html >> >> -- >> Talat UYARER >> -- Talat UYARER Website: http://talat.uyarer.com Twitter: http://twitter.com/talatuyarer Linkedin: http://tr.linkedin.com/pub/talat-uyarer/10/142/304

[jira] [Updated] (NUTCH-1126) JUnit test for urlfilter-prefix

2013-06-24 Thread Talat UYARER (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Talat UYARER updated NUTCH-1126: Attachment: test_case_for_urlfilter-prefix.patch We create a test case patch for this

[jira] [Updated] (NUTCH-1124) JUnit test for scoring-opic

2013-07-05 Thread Talat UYARER (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Talat UYARER updated NUTCH-1124: Attachment: NUTCH1124.patch We create a test case for Opic scoring > JUnit t

[jira] [Updated] (NUTCH-1124) JUnit test for scoring-opic

2013-07-12 Thread Talat UYARER (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Talat UYARER updated NUTCH-1124: Attachment: (was: NUTCH1124.patch) > JUnit test for scoring-o

[jira] [Updated] (NUTCH-1124) JUnit test for scoring-opic

2013-07-12 Thread Talat UYARER (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Talat UYARER updated NUTCH-1124: Attachment: NUTCH-1124.patch Update for indentation. > JUnit test for scor

[jira] [Comment Edited] (NUTCH-1124) JUnit test for scoring-opic

2013-07-12 Thread Talat UYARER (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13706902#comment-13706902 ] Talat UYARER edited comment on NUTCH-1124 at 7/12/13 12:4

[jira] [Updated] (NUTCH-1124) JUnit test for scoring-opic

2013-07-13 Thread Talat UYARER (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Talat UYARER updated NUTCH-1124: Attachment: NUTCH1124.patch I generate like my first patch Jewis. My code environment little mix

[jira] [Updated] (NUTCH-1124) JUnit test for scoring-opic

2013-07-13 Thread Talat UYARER (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Talat UYARER updated NUTCH-1124: Attachment: (was: NUTCH-1124.patch) > JUnit test for scoring-o

[jira] [Updated] (NUTCH-1124) JUnit test for scoring-opic

2013-07-13 Thread Talat UYARER (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Talat UYARER updated NUTCH-1124: Attachment: NUTCH-1124.patch I update my patch. This time I generate like my first patch Jewis. My

[jira] [Issue Comment Deleted] (NUTCH-1124) JUnit test for scoring-opic

2013-07-13 Thread Talat UYARER (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Talat UYARER updated NUTCH-1124: Comment: was deleted (was: I generate like my first patch Jewis. My code environment little mix. I

[jira] [Updated] (NUTCH-1124) JUnit test for scoring-opic

2013-07-13 Thread Talat UYARER (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Talat UYARER updated NUTCH-1124: Attachment: (was: NUTCH1124.patch) > JUnit test for scoring-o

[jira] [Created] (NUTCH-1618) Fetches some websites multiple times for long lasting queues

2013-07-25 Thread Talat UYARER (JIRA)
Talat UYARER created NUTCH-1618: --- Summary: Fetches some websites multiple times for long lasting queues Key: NUTCH-1618 URL: https://issues.apache.org/jira/browse/NUTCH-1618 Project: Nutch

[jira] [Updated] (NUTCH-1618) Fetches some websites multiple times for long lasting queues

2013-07-25 Thread Talat UYARER (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Talat UYARER updated NUTCH-1618: Description: We are using nutch for high volume crawls. We noticed that FetcherJob ReduceTask

[jira] [Updated] (NUTCH-1618) Fetches some websites multiple times for long lasting queues

2013-07-25 Thread Talat UYARER (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Talat UYARER updated NUTCH-1618: Attachment: NUTCH-1618.patch > Fetches some websites multiple times for long lasting que

[jira] [Commented] (NUTCH-1413) Fetcher to record response time

2013-07-31 Thread Talat UYARER (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13724963#comment-13724963 ] Talat UYARER commented on NUTCH-1413: - We write our code on Nutch 2.1 Maybe this

[jira] [Updated] (NUTCH-1413) Fetcher to record response time

2013-07-31 Thread Talat UYARER (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Talat UYARER updated NUTCH-1413: Attachment: NUTCH-1413.patch Developed by Yasin KILINC and Talat UYARER > Fetc

[jira] [Updated] (NUTCH-1413) Fetcher to record response time

2013-07-31 Thread Talat UYARER (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Talat UYARER updated NUTCH-1413: Attachment: NUTCH-1413_metadata.patch I changed the way it is added. Now it uses metadata. If you say

[jira] [Commented] (NUTCH-1619) Writes Dmoz Description and Title information to db with snippet argument

2013-08-20 Thread Talat UYARER (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13744763#comment-13744763 ] Talat UYARER commented on NUTCH-1619: - Hi lufeng, I fixed datastore close pro

[jira] [Updated] (NUTCH-1619) Writes Dmoz Description and Title information to db with snippet argument

2013-08-20 Thread Talat UYARER (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Talat UYARER updated NUTCH-1619: Attachment: NUTCH-1619.patch Data store close fix > Writes Dmoz Description

[jira] [Created] (NUTCH-1630) How to achieve finishing fetch approximately at the same time for each queue (a.k.a adaptive queue size)

2013-08-21 Thread Talat UYARER (JIRA)
Talat UYARER created NUTCH-1630: --- Summary: How to achieve finishing fetch approximately at the same time for each queue (a.k.a adaptive queue size) Key: NUTCH-1630 URL: https://issues.apache.org/jira/browse/NUTCH

[jira] [Updated] (NUTCH-1630) How to achieve finishing fetch approximately at the same time for each queue (a.k.a adaptive queue size)

2013-08-21 Thread Talat UYARER (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Talat UYARER updated NUTCH-1630: Issue Type: Improvement (was: Bug) > How to achieve finishing fetch approximately at the s

[jira] [Updated] (NUTCH-1620) log how many URLs are generated and contained within a particular batchId

2013-09-10 Thread Talat UYARER (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Talat UYARER updated NUTCH-1620: Attachment: NUTCH-1620.patch Hi, I create patch for this. > log how many U

[jira] [Updated] (NUTCH-1630) How to achieve finishing fetch approximately at the same time for each queue (a.k.a adaptive queue size)

2013-09-10 Thread Talat UYARER (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Talat UYARER updated NUTCH-1630: Attachment: NUTCH-1630.patch > How to achieve finishing fetch approximately at the same t

[jira] [Commented] (NUTCH-1620) log how many URLs are generated and contained within a particular batchId

2013-09-10 Thread Talat UYARER (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763422#comment-13763422 ] Talat UYARER commented on NUTCH-1620: - Hi Julien, You can see in patch, I take

[jira] [Commented] (NUTCH-1630) How to achieve finishing fetch approximately at the same time for each queue (a.k.a adaptive queue size)

2013-09-10 Thread Talat UYARER (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763429#comment-13763429 ] Talat UYARER commented on NUTCH-1630: - I attach my patch. Now i use this. This p

[jira] [Commented] (NUTCH-1630) How to achieve finishing fetch approximately at the same time for each queue (a.k.a adaptive queue size)

2013-09-10 Thread Talat UYARER (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13764014#comment-13764014 ] Talat UYARER commented on NUTCH-1630: - Hi Markus, This patch is suitable for t

[jira] [Commented] (NUTCH-1630) How to achieve finishing fetch approximately at the same time for each queue (a.k.a adaptive queue size)

2013-09-11 Thread Talat UYARER (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13764128#comment-13764128 ] Talat UYARER commented on NUTCH-1630: - Sorry Markus, you are right. I thought t

[jira] [Commented] (NUTCH-1086) Rewrite protocol-httpclient

2013-09-16 Thread Talat UYARER (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13768755#comment-13768755 ] Talat UYARER commented on NUTCH-1086: - Markus, I guess httpclient is end of

[jira] [Commented] (NUTCH-1086) Rewrite protocol-httpclient

2013-09-17 Thread Talat UYARER (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769338#comment-13769338 ] Talat UYARER commented on NUTCH-1086: - Hi Markus, Yes I know that Httpclien

[jira] [Updated] (NUTCH-1124) JUnit test for scoring-opic

2013-09-17 Thread Talat UYARER (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Talat UYARER updated NUTCH-1124: Attachment: NUTCH-1124-v2.patch Comment lines added. > JUnit test for scor

[jira] [Created] (NUTCH-1643) Unnecessary fetching with http.content.limit when using protocol-http

2013-09-18 Thread Talat UYARER (JIRA)
Talat UYARER created NUTCH-1643: --- Summary: Unnecessary fetching with http.content.limit when using protocol-http Key: NUTCH-1643 URL: https://issues.apache.org/jira/browse/NUTCH-1643 Project: Nutch

[jira] [Updated] (NUTCH-1643) Unnecessary fetching with http.content.limit when using protocol-http

2013-09-18 Thread Talat UYARER (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Talat UYARER updated NUTCH-1643: Attachment: NUTCH-1643.patch > Unnecessary fetching with http.content.limit when us

[jira] [Updated] (NUTCH-1643) Unnecessary fetching with http.content.limit when using protocol-http

2013-09-18 Thread Talat UYARER (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Talat UYARER updated NUTCH-1643: Patch Info: Patch Available > Unnecessary fetching with http.content.limit when using proto

[jira] [Updated] (NUTCH-1630) How to achieve finishing fetch approximately at the same time for each queue (a.k.a adaptive queue size)

2013-09-18 Thread Talat UYARER (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Talat UYARER updated NUTCH-1630: Patch Info: Patch Available > How to achieve finishing fetch approximately at the same t

[jira] [Created] (NUTCH-1645) Junit Test Case for Adaptive Fetch Schedule class

2013-09-29 Thread Talat UYARER (JIRA)
Talat UYARER created NUTCH-1645: --- Summary: Junit Test Case for Adaptive Fetch Schedule class Key: NUTCH-1645 URL: https://issues.apache.org/jira/browse/NUTCH-1645 Project: Nutch Issue Type

[jira] [Commented] (NUTCH-1647) protocol-http throws unzipBestEffort returned null for some pages

2013-10-01 Thread Talat UYARER (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13782880#comment-13782880 ] Talat UYARER commented on NUTCH-1647: - Hi Markus, This issue affected 2.2.1 Ma

[jira] [Created] (NUTCH-1650) Adaptive Fetch Scheduler interval Wrong Set

2013-10-04 Thread Talat UYARER (JIRA)
Talat UYARER created NUTCH-1650: --- Summary: Adaptive Fetch Scheduler interval Wrong Set Key: NUTCH-1650 URL: https://issues.apache.org/jira/browse/NUTCH-1650 Project: Nutch Issue Type: Bug

[jira] [Updated] (NUTCH-1650) Adaptive Fetch Scheduler interval Wrong Set

2013-10-04 Thread Talat UYARER (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Talat UYARER updated NUTCH-1650: Attachment: NUTCH-1650.patch Adaptive Schedular Setting Interval Patch > Adaptive Fetch Schedu

[jira] [Created] (NUTCH-1651) modifiedTime and prevmodifiedTime never set

2013-10-04 Thread Talat UYARER (JIRA)
Talat UYARER created NUTCH-1651: --- Summary: modifiedTime and prevmodifiedTime never set Key: NUTCH-1651 URL: https://issues.apache.org/jira/browse/NUTCH-1651 Project: Nutch Issue Type: Bug

[jira] [Updated] (NUTCH-1651) modifiedTime and prevmodifiedTime never set

2013-10-04 Thread Talat UYARER (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Talat UYARER updated NUTCH-1651: Attachment: NUTCH-1651.patch > modifiedTime and prevmodifiedTime never

[jira] [Commented] (NUTCH-1650) Adaptive Fetch Scheduler interval Wrong Set

2013-10-04 Thread Talat UYARER (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13786014#comment-13786014 ] Talat UYARER commented on NUTCH-1650: - Markus, I don't see any problem in 1.x

[jira] [Updated] (NUTCH-1588) Port NUTCH-1245 URL gone with 404 after db.fetch.interval.max stays db_unfetched in CrawlDb and is generated over and over again to 2.x

2013-10-05 Thread Talat UYARER (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Talat UYARER updated NUTCH-1588: Attachment: NUTCH-1588.patch I developed this for 2.x > Port NUTCH-1245 URL gone with 404 af

[jira] [Commented] (NUTCH-1568) port pluggable indexing architecture to 2.x

2013-10-05 Thread Talat UYARER (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787191#comment-13787191 ] Talat UYARER commented on NUTCH-1568: - Is anybody dealing with this issue? I wan

[jira] [Commented] (NUTCH-1645) Junit Test Case for Adaptive Fetch Schedule class

2013-10-05 Thread Talat UYARER (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787203#comment-13787203 ] Talat UYARER commented on NUTCH-1645: - This is not a JUnit test, Yasin. Yes, this is

[jira] [Commented] (NUTCH-1568) port pluggable indexing architecture to 2.x

2013-10-05 Thread Talat UYARER (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13787321#comment-13787321 ] Talat UYARER commented on NUTCH-1568: - You are welcome Lewis. Actually I should

[jira] [Updated] (NUTCH-1588) Port NUTCH-1245 URL gone with 404 after db.fetch.interval.max stays db_unfetched in CrawlDb and is generated over and over again to 2.x

2013-10-07 Thread Talat UYARER (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Talat UYARER updated NUTCH-1588: Attachment: NUTCH-1588-final.patch I updated the coding style. Thanks for the notice, Sebastian

[jira] [Commented] (NUTCH-1568) port pluggable indexing architecture to 2.x

2013-10-10 Thread Talat UYARER (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13791398#comment-13791398 ] Talat UYARER commented on NUTCH-1568: - I agree with you. Now I develop for def

[jira] [Comment Edited] (NUTCH-1568) port pluggable indexing architecture to 2.x

2013-10-10 Thread Talat UYARER (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13791398#comment-13791398 ] Talat UYARER edited comment on NUTCH-1568 at 10/10/13 11:2

[jira] [Updated] (NUTCH-1568) port pluggable indexing architecture to 2.x

2013-10-11 Thread Talat UYARER (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Talat UYARER updated NUTCH-1568: Attachment: NUTCH-1568.patch First version of Pluggable Indexing. It provides indexer plugin with

[jira] [Commented] (NUTCH-1568) port pluggable indexing architecture to 2.x

2013-10-11 Thread Talat UYARER (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13793279#comment-13793279 ] Talat UYARER commented on NUTCH-1568: - I uploaded this patch file for testing. I

[jira] [Created] (NUTCH-1655) Indexer Plugin for Elastic Search

2013-10-11 Thread Talat UYARER (JIRA)
Talat UYARER created NUTCH-1655: --- Summary: Indexer Plugin for Elastic Search Key: NUTCH-1655 URL: https://issues.apache.org/jira/browse/NUTCH-1655 Project: Nutch Issue Type: Improvement

[jira] [Updated] (NUTCH-1655) Indexer Plugin for Elastic Search

2013-10-11 Thread Talat UYARER (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Talat UYARER updated NUTCH-1655: Issue Type: Sub-task (was: Improvement) Parent: NUTCH-1568 > Indexer Plugin for Elas

[jira] [Updated] (NUTCH-1568) port pluggable indexing architecture to 2.x

2013-10-14 Thread Talat UYARER (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Talat UYARER updated NUTCH-1568: Attachment: NUTCH-1568-v2.patch I cleaned up unnecessary comment lines and applied Solr 4.x changes

[jira] [Updated] (NUTCH-1655) Indexer Plugin for Elastic Search

2013-10-14 Thread Talat UYARER (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Talat UYARER updated NUTCH-1655: Attachment: NUTCH-1655.patch Elasticsearch indexer plugin for new Pluggable indexers. Before apply

[jira] [Updated] (NUTCH-1655) Indexer Plugin for Elastic Search

2013-10-14 Thread Talat UYARER (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Talat UYARER updated NUTCH-1655: Patch Info: Patch Available > Indexer Plugin for Elastic Sea

[jira] [Commented] (NUTCH-1371) Replace Ivy with Maven Ant tasks

2013-10-16 Thread Talat UYARER (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13797070#comment-13797070 ] Talat UYARER commented on NUTCH-1371: - I can test, Julien. I am interested in

[jira] [Commented] (NUTCH-1413) Fetcher to record response time

2013-10-27 Thread Talat UYARER (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13806402#comment-13806402 ] Talat UYARER commented on NUTCH-1413: - You are right. It needs a configura
