Hello all,
When I try to index text files using the code below, I get errors
like (Error one4, Error one5, Error one6, Error one3, ....). I tried
saving the file in different encodings (UTF-8, Big Endian, UTF), but the
only thing that changes is the number of errors.
The text is in Ethiopic (Geez), and I use my own analyzer.
The environment is Windows 7 and NetBeans IDE 7.3.1, and I have included the
necessary jar files. Please help me get rid of these errors.
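import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;

public void addTextDocument(String htmlPath, IndexWriter indexWriter)
        throws Exception {
    File file = new File(htmlPath);

    // Decode explicitly as UTF-8; the file on disk must really be saved as UTF-8.
    BufferedReader reader = new BufferedReader(
            new InputStreamReader(new FileInputStream(file), "UTF-8"));
    StringBuilder buffer = new StringBuilder();
    try {
        String line;
        while ((line = reader.readLine()) != null) {
            // readLine() strips the line terminator; put it back so the last
            // word of one line is not glued to the first word of the next.
            buffer.append(line).append('\n');
        }
    } finally {
        reader.close(); // also closes the underlying file stream
    }
    String content = buffer.toString();
    String url = file.getName();

    Document document = new Document();
    if (!url.equals("")) {
        document.add(Field.Keyword("url", url));
    }
    if (!content.equals("")) {
        document.add(Field.Text("content", content));
    }
    try {
        System.out.println("=====================================");
        indexWriter.addDocument(document);
        System.out.println("=====================================");
    } catch (IOException e) {
        e.printStackTrace();
    }
}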
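
To rule out the file encoding as the cause, a small standalone check like the one below might help. It is only a sketch: the class name EncodingCheck and its two helper methods are made up for illustration and are not part of Lucene or of my indexing code. It assumes the file is meant to be UTF-8, skips a UTF-8 byte order mark if one is present (Notepad on Windows writes one), decodes strictly so that wrongly encoded bytes raise an exception instead of being silently replaced, and counts the characters that fall in the basic Ethiopic Unicode block.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.file.Files;
import java.nio.file.Paths;

public class EncodingCheck {

    /** Strictly decodes bytes as UTF-8, skipping a leading BOM (EF BB BF) if present. */
    public static String decodeUtf8(byte[] bytes) throws CharacterCodingException {
        int offset = 0;
        if (bytes.length >= 3
                && (bytes[0] & 0xFF) == 0xEF
                && (bytes[1] & 0xFF) == 0xBB
                && (bytes[2] & 0xFF) == 0xBF) {
            offset = 3; // skip the byte order mark
        }
        return Charset.forName("UTF-8").newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)      // throw on bad bytes
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(bytes, offset, bytes.length - offset))
                .toString();
    }

    /** Counts code points in the basic Ethiopic block (U+1200 to U+137F). */
    public static int countEthiopic(String text) {
        int count = 0;
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (Character.UnicodeBlock.of(cp) == Character.UnicodeBlock.ETHIOPIC) {
                count++;
            }
            i += Character.charCount(cp);
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.out.println("usage: java EncodingCheck <text-file>");
            return;
        }
        String text = decodeUtf8(Files.readAllBytes(Paths.get(args[0])));
        System.out.println("decoded chars:  " + text.length());
        System.out.println("Ethiopic chars: " + countEthiopic(text));
    }
}
```

If this throws a CharacterCodingException, or reports zero Ethiopic characters for a file that clearly contains Ethiopic text, the file is probably not stored as UTF-8 and the problem is in how it was saved, not in the indexing code.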