Hi Rahil,

Your out-of-memory error is likely due to a MySQL Connector/J bug outlined
here:

http://bugs.mysql.com/bug.php?id=7698

There is a workaround presented in that report. I have been able to select
large datasets from MySQL while indexing by using the SQL_BIG_RESULT hint on
the MySQL side and pumping up the maximum heap size on the Java side via
-Xmx2048m. I am not sure about the IOException, except perhaps there is a
stale lock file or an otherwise corrupted index? (See the P.S. at the bottom
for a lock-clearing sketch.)
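In case it helps, here is roughly what that workaround looks like in code.
This is an untested sketch: the driver class, connection URL and credentials
are placeholders, and the table and column names are just copied from your
snippet.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class StreamingSelect {
    public static void main(String[] args) throws Exception {
        Class.forName("com.mysql.jdbc.Driver");

        // Placeholders - substitute your real URL, user and password.
        Connection conn = DriverManager.getConnection(
                "jdbc:mysql://localhost/yourdb", "user", "password");

        // The bug-report workaround: TYPE_FORWARD_ONLY + CONCUR_READ_ONLY
        // plus a fetch size of Integer.MIN_VALUE puts Connector/J into
        // streaming mode, so rows are read one at a time instead of the
        // whole result set being buffered in memory.
        Statement stmt = conn.createStatement(
                ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY);
        stmt.setFetchSize(Integer.MIN_VALUE);

        // SQL_BIG_RESULT additionally tells the MySQL server itself to
        // plan for a large result; no LIMIT, since you want the full table.
        ResultSet rs = stmt.executeQuery(
                "SELECT SQL_BIG_RESULT CONCEPTID, TERM, DESCRIPTIONTYPE"
                + " FROM sct_descriptions_20050731");
        while (rs.next()) {
            // hand each row to the indexer here
        }
        rs.close();
        stmt.close();
        conn.close();
    }
}

One caveat of streaming mode: you must read the result set to completion (or
close it) before issuing any other statement on the same connection.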
Dennis

On Friday 19 May 2006 08:55, Rahil wrote:
> Thanks Paul and Otis
>
> I basically applied the same mechanism used in creating indexes in MySQL
> to Lucene, so I didn't use any fetchSize. But I'll implement it now and
> see how it performs. Will also look into DBSight.
>
> However, when executing the query with the result set limited to 100,000,
> the query executed fine but it led to an IOException during the Lucene
> index creation. I might be wrong, but I think this IOException has nothing
> to do with the MySQL code but rather with the addition of the Document
> objects to the index. I'm attaching a bit more of my code below.
>
> -------
>
> The main method has the following method calls:
>
> // establish a connection with MySQL
> Connection conn = lucene.connectMySQL();
>
> // run the SQL query
> ResultSet rs = lucene.executeQuery(conn);
>
> // build the index
> lucene.indexResultSet(rs);
>
> private ResultSet executeQuery(Connection conn) {
>
>     ResultSet resultSet = null;
>     System.out.println("Executing query...");
>
>     try {
>         String sql = "SELECT CONCEPTID, TERM, DESCRIPTIONTYPE"
>                 + " FROM sct_descriptions_20050731 limit 10000";
>         PreparedStatement stmt = conn.prepareStatement(sql);
>         resultSet = stmt.executeQuery();
>     } catch (SQLException e) {
>         e.printStackTrace();
>     }
>     return resultSet;
> }
>
> private IndexWriter getIndexWriter(boolean create) throws IOException {
>
>     if (indexWriter == null)
>         indexWriter = new IndexWriter(asksPath + "index",
>                 new StandardAnalyzer(), create);
>
>     return indexWriter;
> }
>
> private void indexResultSet(ResultSet rs) throws IOException {
>
>     Document lucDoc = null;
>
>     // problem when I set this to 'false': there's a javacc error which
>     // does not appear once I delete all the files in the 'index'
>     // directory and set the flag back to 'true'
>     indexWriter = getIndexWriter(true);
>     System.out.println("Starting to index resultset ...");
>     try {
>         while (rs.next()) {
>             lucDoc = new Document();
>             lucDoc.add(Field.Keyword("conceptId", rs.getString("CONCEPTID")));
>             lucDoc.add(Field.Text("term", rs.getString("TERM")));
>             lucDoc.add(Field.UnIndexed("descriptionType",
>                     rs.getString("DESCRIPTIONTYPE")));
>
>             indexWriter.addDocument(lucDoc);
>         }
>
>         rs.close();
>         closeIndexWriter();
>
>         //System.out.println("Completed indexing resultset");
>
>     } catch (SQLException e) {
>         e.printStackTrace();
>     }
> }
>
> --------
>
> Thanks for all your help
>
> Rahil
>
> [EMAIL PROTECTED] wrote:
> >I guess you are executing your SQL and getting the whole result set.
> >There are options on the JDBC Statement class that can be used for
> >controlling the fetch size - by using these you should be able to limit
> >the amount of data returned from the database so you don't get OOM. I
> >haven't used these, so I am guessing a little. Are you pulling the whole
> >result set into memory and then adding it to your index, or are you
> >iterating through the result set, adding one entry at a time to your
> >index? The latter would be better. There is also something called DBSight
> >(that I know very little about) but it seems to do exactly what you are
> >trying to do.
> >
> >Regards
> >
> >Paul I.
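(Dennis again, interjecting inline: Paul's row-at-a-time suggestion, combined
with the finally-block cleanup Otis describes below, might look roughly like
this inside your LuceneIndex class. An untested sketch; I have changed the
signature to take the writer as a parameter, and the field setup is copied
from your snippet.)

import java.io.IOException;
import java.sql.ResultSet;
import java.sql.SQLException;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;

// Add one Document per row as it streams in, so only the current row is
// ever held in memory. The finally block closes the ResultSet and the
// IndexWriter (releasing the index write lock) even if indexing fails.
private void indexResultSet(ResultSet rs, IndexWriter writer)
        throws IOException, SQLException {
    try {
        while (rs.next()) {
            Document doc = new Document();
            doc.add(Field.Keyword("conceptId", rs.getString("CONCEPTID")));
            doc.add(Field.Text("term", rs.getString("TERM")));
            doc.add(Field.UnIndexed("descriptionType",
                    rs.getString("DESCRIPTIONTYPE")));
            writer.addDocument(doc);
        }
    } finally {
        rs.close();
        writer.close();
    }
}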
> >
> >     Otis Gospodnetic <otis_gospodnetic@yahoo.com>      19/05/2006 15:24
> >     To: java-user@lucene.apache.org
> >     Subject: Re: OutOfMemory and IOException Access Denied errors
> >     Please respond to: [EMAIL PROTECTED]
> >
> >It's impossible to tell from the code you provided, but you are most
> >likely just leaking memory/resources somewhere. For example, ResultSets
> >and other DB operations should typically be placed in a try/catch/FINALLY
> >block, where the finally block ensures all DB resources are
> >closed/released.
> >
> >Otis
> >
> >----- Original Message ----
> >From: Rahil <[EMAIL PROTECTED]>
> >To: Lucene User Group <java-user@lucene.apache.org>
> >Sent: Friday, May 19, 2006 8:27:55 AM
> >Subject: OutOfMemory and IOException Access Denied errors
> >
> >Hi
> >
> >I am new to Lucene, so am perhaps missing something obvious. I have
> >included Lucene 1.9.1 in my classpath and am trying to integrate it with
> >MySQL.
> >
> >I have a table which has nearly a million records in it. According to the
> >documentation on Lucene I have read so far, my understanding is that I
> >need to (1) make a connection with MySQL, then (2) execute the query
> >normally in SQL syntax, (3) then pass the ResultSet to the method that
> >creates the indexes. (4) I can then pass a queryString to the
> >searchIndex() custom method to locate the queryString.
> >
> >PROBLEMS:
> >
> >(a) The first problem I had was when trying to execute the query on the
> >million-record table in Step 1. It resulted in an OutOfMemory error due
> >to the size of the table. How can I get around this problem so that the
> >query executes on the entire table at one time?
> >
> >As a workaround, I limited the number of results to 100,000, which worked
> >fine.
> >
> >(b) But I then received an IOException while the Document objects were
> >being written to the index. The Exception stack trace is shown below:
> >
> >---
> >Exception in thread "main" java.io.IOException: Access is denied
> >at java.io.WinNTFileSystem.createFileExclusively(Native Method)
> >at java.io.File.createNewFile(File.java:850)
> >at org.apache.lucene.store.FSDirectory$1.obtain(FSDirectory.java:324)
> >at org.apache.lucene.store.Lock.obtain(Lock.java:92)
> >at org.apache.lucene.store.Lock$With.run(Lock.java:147)
> >at org.apache.lucene.index.IndexWriter.mergeSegments(IndexWriter.java:442)
> >at org.apache.lucene.index.IndexWriter.maybeMergeSegments(IndexWriter.java:401)
> >at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:260)
> >at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:244)
> >at man.ac.uk.most.LuceneIndex.indexResultSet(LuceneIndex.java:102) ---
> >error line in my piece of code!
> >at man.ac.uk.most.LuceneIndex.main(LuceneIndex.java:40)
> >at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> >at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >at java.lang.reflect.Method.invoke(Method.java:585)
> >at com.intellij.rt.execution.application.AppMain.main(AppMain.java:78)
> >---
> >
> >Line 102 is in the following block of my code:
> >
> >----
> >while (rs.next()) {
> >    lucDoc = new Document();
> >    lucDoc.add(Field.Keyword("conceptId", rs.getString("CONCEPTID")));
> >    lucDoc.add(Field.Text("term", rs.getString("TERM")));
> >    lucDoc.add(Field.UnIndexed("descriptionType",
> >            rs.getString("DESCRIPTIONTYPE")));
> >
> >    indexWriter.addDocument(lucDoc); --- problem line 102
> >}
> >
> >rs.close();
> >closeIndexWriter();
> >----
> >
> >If I limit Step 1 to 10,000 records, the program runs fine and there's no
> >problem. However, I need to index the entire table, either as a single
> >query or an incremental one.
> >
> >Can someone please help me with these problems?
> >
> >Thanks
> >Rahil
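P.S. About the "Access is denied" lock trace above: if a previous run died
without closing its IndexWriter, a stale lock file can be left behind and
later opens will fail. Clearing it with the Lucene 1.9 API might look like
the sketch below. This is untested, the helper name and placement are my own
invention, and it is only safe when you are certain no other process is
writing to the index.

import java.io.IOException;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

// Call once before getIndexWriter(), e.g. at the start of main().
// indexPath would be your asksPath + "index" directory.
private static void clearStaleLock(String indexPath) throws IOException {
    Directory dir = FSDirectory.getDirectory(indexPath, false);
    if (IndexReader.isLocked(dir)) {
        IndexReader.unlock(dir); // removes the leftover lock file(s)
    }
    dir.close();
}

If unlocking does not help, the "Access is denied" coming out of
WinNTFileSystem.createFileExclusively may instead be a plain file-permission
problem on the directory where Lucene creates its lock files, which is worth
checking as well.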