Hi Markus,

Thanks for the response. I did not find any GC issues. I also increased mapred.task.timeout to 10000, but I still have the same issue. The fetcher log is below.
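A rough sketch of the arithmetic behind that setting, under assumptions that are not confirmed in this thread: mapred.task.timeout is taken to be in milliseconds with a Hadoop default of 600000, and the Nutch Fetcher is assumed to treat a thread as hung after the task timeout divided by fetcher.threads.timeout.divisor (assumed default 2). The helper name below is hypothetical and is not part of Nutch or Hadoop.

```python
# Hypothetical helper: NOT a Nutch or Hadoop API, only the assumed arithmetic.
DEFAULT_TASK_TIMEOUT_MS = 10 * 60 * 1000  # assumed Hadoop default: 600000 ms


def hung_thread_timeout_ms(task_timeout_ms: int, divisor: int = 2) -> int:
    """Return the assumed hung-thread limit: task timeout divided by the divisor."""
    if divisor <= 0:
        raise ValueError("divisor must be positive")
    return task_timeout_ms // divisor


if __name__ == "__main__":
    # Value used in this message: 10000 ms, i.e. 10 seconds.
    print(hung_thread_timeout_ms(10000))                    # 5000
    # Assumed default value: 600000 ms, i.e. 10 minutes.
    print(hung_thread_timeout_ms(DEFAULT_TASK_TIMEOUT_MS))  # 300000
```

If those assumptions hold, 10000 would give a limit of about 5 seconds, which is shorter than the default rather than longer. The elapsed time of 00:05:03 reported at the end of the log would instead fit the assumed default of 600000 ms, so it may be worth checking whether the new value was actually picked up.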

2025-01-02 14:36:53,481 INFO o.a.n.f.Fetcher [LocalJobRunner Map Task Executor #0] -activeThreads=1, spinWaiting=1, fetchQueues.totalSize=1, fetchQueues.getQueueCount=1
2025-01-02 14:36:53,481 INFO o.a.n.f.FetchItemQueues [LocalJobRunner Map Task Executor #0] * queue: http://www.titck.gov.tr
2025-01-02 14:36:53,481 INFO o.a.n.f.FetchItemQueue [LocalJobRunner Map Task Executor #0] maxThreads = 1
2025-01-02 14:36:53,481 INFO o.a.n.f.FetchItemQueue [LocalJobRunner Map Task Executor #0] inProgress = 1
2025-01-02 14:36:53,481 INFO o.a.n.f.FetchItemQueue [LocalJobRunner Map Task Executor #0] crawlDelay = 5000
2025-01-02 14:36:53,481 INFO o.a.n.f.FetchItemQueue [LocalJobRunner Map Task Executor #0] minCrawlDelay = 0
2025-01-02 14:36:53,481 INFO o.a.n.f.FetchItemQueue [LocalJobRunner Map Task Executor #0] nextFetchTime = 1735828612457
2025-01-02 14:36:53,481 INFO o.a.n.f.FetchItemQueue [LocalJobRunner Map Task Executor #0] now = 1735828613481
2025-01-02 14:36:53,481 INFO o.a.n.f.FetchItemQueue [LocalJobRunner Map Task Executor #0] 0. https://www.titck.gov.tr/
2025-01-02 14:36:53,481 WARN o.a.n.f.Fetcher [LocalJobRunner Map Task Executor #0] Aborting with 1 hung threads.
2025-01-02 14:36:53,481 WARN o.a.n.f.Fetcher [LocalJobRunner Map Task Executor #0] Thread #0 hung while processing https://www.titck.gov.tr/
2025-01-02 14:36:53,536 WARN o.a.h.m.i.MetricsSystemImpl [pool-55-thread-1] JobTracker metrics system already initialized!
2025-01-02 14:36:54,389 INFO o.a.h.m.Job [main] map 100% reduce 100%
2025-01-02 14:36:54,389 INFO o.a.h.m.Job [main] Job job_local1014979377_0001 completed successfully
2025-01-02 14:36:54,397 INFO o.a.h.m.Job [main] Counters: 31
    File System Counters
        FILE: Number of bytes read=1717876
        FILE: Number of bytes written=3144478
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
    Map-Reduce Framework
        Map input records=1
        Map output records=0
        Map output bytes=0
        Map output materialized bytes=14
        Input split bytes=162
        Combine input records=0
        Combine output records=0
        Reduce input groups=0
        Reduce shuffle bytes=14
        Reduce input records=0
        Reduce output records=0
        Spilled Records=0
        Shuffled Maps =1
        Failed Shuffles=0
        Merged Map outputs=1
        GC time elapsed (ms)=0
        Total committed heap usage (bytes)=4299161600
    FetcherStatus
        bytes_downloaded=0
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=182
    File Output Format Counters
        Bytes Written=564
2025-01-02 14:36:54,397 INFO o.a.n.f.Fetcher [main] Fetcher: finished at 2025-01-02 14:36:54, elapsed: 00:05:03

Thanks and Regards
Raj Chidara

---- On Thu, 02 Jan 2025 18:10:05 +0530 Markus Jelsma <[email protected]> wrote ---

Hi Raj,

I can't seem to find an issue crawling that site, but maybe your parser is hanging. That is usually the case when 'hanging' threads are detected. You can also increase -Dmapred.task.timeout=; it controls how long the fetcher waits before giving up on hanging threads. Also check your logs; there can be a hint in there, such as a GC issue.

Regards,
Markus

On Wed, 1 Jan 2025 at 15:26, Raj Chidara <mailto:[email protected]> wrote:

> Hi
>
> I have a problem crawling and fetching this site:
> https://www.titck.gov.tr/ . It either crawls the same page again and
> again, or sometimes I get an error that Thread #0 hung while processing
> https://www.titck.gov.tr/. Can you please help me out?
>
> Thanks and Regards
> Raj Chidara
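
A small sketch for reading the FetchItemQueue dump in the log above. It assumes that nextFetchTime and now are Unix epoch timestamps in milliseconds; the function names are illustrative and are not Nutch APIs. The two values are copied from the log.

```python
from datetime import datetime, timezone

# Values copied from the FetchItemQueue lines in the log (assumed epoch milliseconds).
NEXT_FETCH_TIME_MS = 1735828612457
NOW_MS = 1735828613481


def overdue_ms(now_ms: int, next_fetch_time_ms: int) -> int:
    """Milliseconds by which 'now' is past the queue's next allowed fetch time."""
    return now_ms - next_fetch_time_ms


def to_utc(epoch_ms: int) -> datetime:
    """Convert epoch milliseconds to a UTC datetime (truncated to whole seconds)."""
    return datetime.fromtimestamp(epoch_ms // 1000, tz=timezone.utc)


if __name__ == "__main__":
    print(overdue_ms(NOW_MS, NEXT_FETCH_TIME_MS))          # 1024
    print(to_utc(NOW_MS).strftime("%Y-%m-%d %H:%M:%S"))    # 2025-01-02 14:36:53
```

Under that reading, the queue had been eligible to fetch for about one second when the dump was written, so the crawlDelay of 5000 does not look like what was holding the thread. With inProgress = 1, the single fetch thread was presumably still busy with the request for https://www.titck.gov.tr/, which would fit the hung-thread warning.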

