Hi Folks,

There is an issue with the protocol-file plugin when fetching files that
contain CJK characters in the file name (see JIRA NUTCH-968).

After checking the code, I found that the problem is due to the encoding
of the file name while fetching the directory. After changing a couple of
lines as discussed in JIRA NUTCH-968, the issue is resolved.
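
To illustrate the kind of encoding problem I mean, here is a minimal sketch.
It is not the actual patch and not Nutch code: the class and method names are
made up for the example, and it assumes file names should be percent-encoded
as UTF-8 when they are turned into URLs and decoded the same way when the URL
is mapped back to a file.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical illustration only; not part of the protocol-file plugin.
public class FileNameEncodingSketch {

    // Percent-encode a file name as UTF-8 so it can be placed in a URL.
    public static String encodeName(String name) throws UnsupportedEncodingException {
        // URLEncoder produces form encoding ('+' for a space), so map it to %20.
        return URLEncoder.encode(name, "UTF-8").replace("+", "%20");
    }

    // Reverse step: recover the original file name from the encoded form.
    public static String decodeName(String encoded) throws UnsupportedEncodingException {
        return URLDecoder.decode(encoded, "UTF-8");
    }

    public static void main(String[] args) throws Exception {
        String name = "\u4e2d\u6587 file.txt"; // a name with two CJK characters
        String encoded = encodeName(name);
        System.out.println(encoded);
        // The round trip should give back the original name.
        System.out.println(decodeName(encoded).equals(name));
    }
}
```

If the encode and decode steps do not use the same charset, the decoded name
no longer matches the file on disk, which is the sort of mismatch I suspect
here.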

I see the issue is still open in JIRA and the latest Nutch release does not
include a fix yet. I would like to discuss my solution further here on the
list and submit the patch once it looks fine.

Anyone in for it? I can elaborate further on the fix.

Cheers,

Ye
