Hi Markus

Thanks for the response. I did not find any GC issues. I also increased
mapred.task.timeout to 10000, but I still have the same issue.



2025-01-02 14:36:53,481 INFO o.a.n.f.Fetcher [LocalJobRunner Map Task Executor #0] -activeThreads=1, spinWaiting=1, fetchQueues.totalSize=1, fetchQueues.getQueueCount=1
2025-01-02 14:36:53,481 INFO o.a.n.f.FetchItemQueues [LocalJobRunner Map Task Executor #0] * queue: http://www.titck.gov.tr
2025-01-02 14:36:53,481 INFO o.a.n.f.FetchItemQueue [LocalJobRunner Map Task Executor #0]   maxThreads    = 1
2025-01-02 14:36:53,481 INFO o.a.n.f.FetchItemQueue [LocalJobRunner Map Task Executor #0]   inProgress    = 1
2025-01-02 14:36:53,481 INFO o.a.n.f.FetchItemQueue [LocalJobRunner Map Task Executor #0]   crawlDelay    = 5000
2025-01-02 14:36:53,481 INFO o.a.n.f.FetchItemQueue [LocalJobRunner Map Task Executor #0]   minCrawlDelay = 0
2025-01-02 14:36:53,481 INFO o.a.n.f.FetchItemQueue [LocalJobRunner Map Task Executor #0]   nextFetchTime = 1735828612457
2025-01-02 14:36:53,481 INFO o.a.n.f.FetchItemQueue [LocalJobRunner Map Task Executor #0]   now           = 1735828613481
2025-01-02 14:36:53,481 INFO o.a.n.f.FetchItemQueue [LocalJobRunner Map Task Executor #0]   0. https://www.titck.gov.tr/
2025-01-02 14:36:53,481 WARN o.a.n.f.Fetcher [LocalJobRunner Map Task Executor #0] Aborting with 1 hung threads.
2025-01-02 14:36:53,481 WARN o.a.n.f.Fetcher [LocalJobRunner Map Task Executor #0] Thread #0 hung while processing https://www.titck.gov.tr/

2025-01-02 14:36:53,536 WARN o.a.h.m.i.MetricsSystemImpl [pool-55-thread-1] JobTracker metrics system already initialized!
2025-01-02 14:36:54,389 INFO o.a.h.m.Job [main]  map 100% reduce 100%
2025-01-02 14:36:54,389 INFO o.a.h.m.Job [main] Job job_local1014979377_0001 completed successfully
2025-01-02 14:36:54,397 INFO o.a.h.m.Job [main] Counters: 31
    File System Counters
        FILE: Number of bytes read=1717876
        FILE: Number of bytes written=3144478
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
    Map-Reduce Framework
        Map input records=1
        Map output records=0
        Map output bytes=0
        Map output materialized bytes=14
        Input split bytes=162
        Combine input records=0
        Combine output records=0
        Reduce input groups=0
        Reduce shuffle bytes=14
        Reduce input records=0
        Reduce output records=0
        Spilled Records=0
        Shuffled Maps =1
        Failed Shuffles=0
        Merged Map outputs=1
        GC time elapsed (ms)=0
        Total committed heap usage (bytes)=4299161600
    FetcherStatus
        bytes_downloaded=0
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=182
    File Output Format Counters
        Bytes Written=564
2025-01-02 14:36:54,397 INFO o.a.n.f.Fetcher [main] Fetcher: finished at 2025-01-02 14:36:54, elapsed: 00:05:03
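
As a quick sanity check on the queue dump above, here is a minimal sketch
that decodes the two timestamps. It assumes nextFetchTime and now are epoch
milliseconds in UTC; the helper name from_epoch_ms is only illustrative and
is not part of Nutch. Under that assumption, the queue had already been
eligible to fetch for about one second when the fetcher aborted, which
suggests the crawl delay alone does not explain the hang.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def from_epoch_ms(ms: int) -> datetime:
    """Convert epoch milliseconds to an aware UTC datetime.

    Uses integer timedelta arithmetic to avoid float rounding.
    """
    return EPOCH + timedelta(milliseconds=ms)


# Values copied from the FetchItemQueue dump in the log above.
next_fetch_time_ms = 1735828612457
now_ms = 1735828613481

# How long the queue had been allowed to fetch before the abort.
overdue_ms = now_ms - next_fetch_time_ms

print("nextFetchTime:", from_epoch_ms(next_fetch_time_ms).isoformat(timespec="milliseconds"))
print("now:          ", from_epoch_ms(now_ms).isoformat(timespec="milliseconds"))
print("overdue by (ms):", overdue_ms)
```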





Thanks and Regards

Raj Chidara








---- On Thu, 02 Jan 2025 18:10:05 +0530 Markus Jelsma 
<[email protected]> wrote ---



Hi Raj, 
 
I can't find an issue crawling that site, but maybe your parser is
hanging; that is usually the cause when 'hanging' threads are detected. You
can also increase -Dmapred.task.timeout=, which controls how long the
fetcher waits before giving up on hanging threads.
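
To make the timeout arithmetic concrete, here is a minimal sketch. It
assumes the fetcher derives its hung-thread threshold as the task timeout
divided by fetcher.threads.timeout.divisor, with defaults of 600000 ms and
2. Those defaults are my recollection of Nutch's configuration, not
something stated in this thread, so check them against your own
nutch-default.xml. The function name is illustrative only.

```python
def hung_thread_timeout_ms(task_timeout_ms: int = 10 * 60 * 1000,
                           divisor: int = 2) -> int:
    """Time a fetcher thread may stay busy before it is reported as hung.

    Assumed formula: task timeout / fetcher.threads.timeout.divisor.
    Both defaults are assumptions; verify them in your Nutch version.
    """
    return task_timeout_ms // divisor


# With the assumed defaults the threshold is five minutes.
print(hung_thread_timeout_ms())

# A task timeout of 10000 ms would lower the threshold rather than raise it.
print(hung_thread_timeout_ms(task_timeout_ms=10000))
```

Note that the value is in milliseconds, so giving threads more time means
setting it above the assumed 600000 ms default.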
 
Also check your logs; there may be a hint in there, such as a GC issue.
 
Regards, 
Markus 
 
On Wed, 1 Jan 2025 at 15:26, Raj Chidara
<mailto:[email protected]> wrote:
 
> Hi 
> 
>  I have a problem crawling and fetching this site,
> https://www.titck.gov.tr/ . It either crawls the same page again and
> again, or sometimes I get an error that Thread #0 hung while processing
> https://www.titck.gov.tr/. Can you please help me out?
> 
> Thanks and Regards 
> 
> Raj Chidara 
> 
> 
> 
> 
> Global Locations: 
> 
> USA | UK | India | Singapore | Japan 
> 
> *ISO 9001, 27001, 13485 Compliant 
> 
> www.DDIsmart.com 
> 
> About Us | Awards | Blog | News | Contact Us 
> 
> 
> 
> 
> 
> 
> 
> DISCLAIMER: This message is intended solely for the use of the individual 
> or entity to which it is addressed. If you are not the intended recipient, 
> you should not use, copy, alter, or disclose the contents of this message. 
> All information or opinions expressed in this message and/or any 
> attachments are those of the author and are not necessarily those of the 
> group companies. 
> 
> 
> 
> 
>

 
 
 
 


