Hi Boyan,

Thank you for taking the time to look into this. The suggested method works smoothly!

Best regards,
Jeremy

On 11 July 2011 17:42, Boyan Kukushev <[email protected]> wrote:
> Hi again,
>
> I forgot something in the code - one should call
> test.repository.synchronizeIndex(false)
> after each document is added.
>
> HTH,
> Boyan
>
> On Monday, July 11, 2011 16:54:24 Boyan Kukushev wrote:
> > Hi Jeremy,
> >
> > Yes, your guess is correct - KIM is doing several checks on the user
> > query and these checks will throw an exception if there is a special
> > character in the query.
> >
> > A simple workaround that would be useful in this case is calculating a
> > hash code of the document content before adding it to the KIM
> > repository. The hash code is then added as a KIM document feature in
> > the document's feature map.
> >
> > *Important:* in order to store this value in Owlim and Lucene as a
> > document field, you should add the name of the hash code feature to the
> > *com.ontotext.kim.KIMConstants.DOCUMENT_FEAT_LIST* list, located in the
> > *<kim-home>/config/document.repository.properties* configuration file.
> >
> > You can find attached sample code showing the required API usage. The
> > code uses the apache commons-codec library to produce hex MD5 hash
> > strings.
> >
> > If this solution does not work, please provide information about the
> > version of KIM you are using along with log files and possible exceptions
> > that occurred while executing the sample code.
> >
> > Hope this helps!
> >
> > Regards,
> > Boyan
> >
> > On Saturday, July 09, 2011 19:43:16 Jeremy Raes wrote:
> > > Dear Philip,
> > >
> > > Thanks for the fast reply.
> > >
> > > Attached to this mail, you'll find a file with java code, producing the
> > > following error:
> > >
> > > com.ontotext.kim.client.query.KIMQueryException: Lucene special characters in field name in brackets: [Canalhopper.\(Duur\]
> > >     at com.ontotext.kim.lucene.LuceneDocumentRepositoryImpl.getDocumentIds(LuceneDocumentRepositoryImpl.java:429)
> > >     at com.ontotext.kim.coredb.CachingDocumentRepository.getDocumentIds(CachingDocumentRepository.java:91)
> > >     at com.ontotext.kim.coredb.RdfCore.getDocumentIds(RdfCore.java:266)
> > >     at com.ontotext.kim.coredb.CachingDocumentRepository.getDocumentIds(CachingDocumentRepository.java:91)
> > >     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > >     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> > >     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> > >     at java.lang.reflect.Method.invoke(Method.java:597)
> > >     at org.openrdf.util.rmirouting.ChannelIfaceImpl.invoke(ChannelIfaceImpl.java:513)
> > >     at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
> > >     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> > >     at java.lang.reflect.Method.invoke(Method.java:597)
> > >     at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:305)
> > >     at sun.rmi.transport.Transport$1.run(Transport.java:159)
> > >     at java.security.AccessController.doPrivileged(Native Method)
> > >     at sun.rmi.transport.Transport.serviceCall(Transport.java:155)
> > >     at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:535)
> > >     at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:790)
> > >     at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:649)
> > >     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> > >     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> > >     at java.lang.Thread.run(Thread.java:680)
> > > Exception in thread "main" java.lang.NullPointerException
> > >     at tmp.QueryDoc.itemNotInRepository(QueryDoc.java:47)
> > >     at tmp.QueryDoc.main(QueryDoc.java:96)
> > >
> > > This error does not occur when I query a String that does not contain
> > > any special characters.
> > >
> > > Thanks for your help!
> > >
> > > Best,
> > > Jeremy
> > >
> > > On 9 July 2011 13:23, Philip Alexiev @ Ontotext <[email protected]> wrote:
> > > > Hi Jeremy,
> > > >
> > > > It is best if you provide a simple standalone class or a test case
> > > > that works with some test data and will reproduce the problem. That
> > > > way we can track exactly what is happening.
> > > >
> > > > Thank you,
> > > > Philip
> > > >
> > > > On 8 Jul 2011, at 6:08 PM, Jeremy Raes wrote:
> > > >
> > > > Hey,
> > > >
> > > > I am building an application upon KIM whereby I need to check if a
> > > > document already exists in the repository before deciding on adding
> > > > it.
> > > >
> > > > To do this, I wrote the following code:
> > > >
> > > > private boolean itemNotInRepository(Item item){
> > > >>     assert(item != null);
> > > >>     DocumentQuery query = new DocumentQuery();
> > > >>     DocumentQueryResult queryResult = null;
> > > >>     try {
> > > >>         String escaped = QueryParser.escape(item.getDescription());
> > > >>         query.setKeywordRestriction(escaped);
> > > >>         queryResult = this.apiDR.getDocumentIds(query);
> > > >>     } catch (KIMQueryException e) {
> > > >>         e.printStackTrace();
> > > >>     }
> > > >>     return queryResult.isEmpty();
> > > >> }
> > > >
> > > > Because some of the Strings, returned by item.getDescription(), might
> > > > contain special characters [mainly "(" and ")"], I added the String
> > > > escaped = QueryParser.escape(item.getDescription()) to my code, but
> > > > nonetheless I get a KIMQueryException:
> > > >
> > > >> com.ontotext.kim.client.query.KIMQueryException: Lucene special characters in field name in brackets: [Canalhopper.\(Duur\]
> > > >>     at com.ontotext.kim.lucene.LuceneDocumentRepositoryImpl.getDocumentIds(LuceneDocumentRepositoryImpl.java:429)
> > > >>     at com.ontotext.kim.coredb.CachingDocumentRepository.getDocumentIds(CachingDocumentRepository.java:91)
> > > >>     at com.ontotext.kim.coredb.RdfCore.getDocumentIds(RdfCore.java:266)
> > > >>     at com.ontotext.kim.coredb.CachingDocumentRepository.getDocumentIds(CachingDocumentRepository.java:91)
> > > >>     at sun.reflect.GeneratedMethodAccessor86.invoke(Unknown Source)
> > > >>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> > > >>     at java.lang.reflect.Method.invoke(Method.java:597)
> > > >>     at org.openrdf.util.rmirouting.ChannelIfaceImpl.invoke(ChannelIfaceImpl.java:513)
> > > >>     at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
> > > >>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> > > >>     at java.lang.reflect.Method.invoke(Method.java:597)
> > > >>     at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:305)
> > > >>     at sun.rmi.transport.Transport$1.run(Transport.java:159)
> > > >>     at java.security.AccessController.doPrivileged(Native Method)
> > > >>     at sun.rmi.transport.Transport.serviceCall(Transport.java:155)
> > > >>     at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:535)
> > > >>     at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:790)
> > > >>     at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:649)
> > > >>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> > > >>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> > > >>     at java.lang.Thread.run(Thread.java:680)
> > > >> Exception in thread "main" java.lang.NullPointerException
> > > >>     at knowledgeAcquisition.KIMKnowledgeAcquisition.itemNotInRepository(KIMKnowledgeAcquisition.java:147)
> > > >>     at knowledgeAcquisition.KIMKnowledgeAcquisition.execute(KIMKnowledgeAcquisition.java:188)
> > > >>     at run.Main.main(Main.java:21)
> > > >
> > > > My guess is that KIM pre-checks the query (before processing it with
> > > > Lucene) and throws an error when a special character is found -- even
> > > > though there is an "\" before the special character. Any suggestions
> > > > on how I can either (1) avoid this error or (2) check in some other
> > > > way whether a document already exists in the document repository?
> > > >
> > > > Any help is appreciated. Thanks in advance!
> > > >
> > > > Best regards,
> > > >
> > > > Jeremy
> > > > _______________________________________________
> > > > Kim-discussion mailing list
> > > > [email protected]
> > > > http://ontotext.com/mailman/listinfo/kim-discussion
>
> --
> Boyan Kukushev
> Senior Software Engineer / Java Developer
> Ontotext AD @ Sirma Group Corp.
_______________________________________________
Kim-discussion mailing list
[email protected]
http://ontotext.com/mailman/listinfo/kim-discussion
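
For readers without the attachment mentioned in the thread, here is a rough sketch of the hashing step only. It is not the attached sample code: it uses the JDK's MessageDigest instead of commons-codec so that it runs without extra libraries, and the class name ContentHashSketch, the feature name "contentHash", and the plain Map standing in for a KIM document's feature map are all assumptions made for illustration. The idea is that a hex MD5 string contains only the characters 0-9 and a-f, so a lookup on it should not contain any Lucene special characters.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

// Sketch only: computes a hex MD5 hash of document content.
// No KIM classes are used here; the Map below is a stand-in for a
// document feature map and is NOT the KIM API.
final class ContentHashSketch {

    // Hypothetical feature name. Per the thread, whatever name is chosen
    // must also be listed in DOCUMENT_FEAT_LIST in
    // <kim-home>/config/document.repository.properties.
    static final String HASH_FEATURE = "contentHash";

    // Returns the MD5 digest of the UTF-8 bytes of content as 32 lowercase hex characters.
    static String md5Hex(String content) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(content.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is a required algorithm on every Java platform implementation.
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    public static void main(String[] args) {
        // Content with Lucene special characters, as in the failing query.
        String content = "Canalhopper (Duur)";

        // Stand-in for the document's feature map (assumption).
        Map<String, Object> features = new HashMap<>();
        features.put(HASH_FEATURE, md5Hex(content));

        System.out.println(features);
    }
}
```

With commons-codec on the classpath, DigestUtils.md5Hex(content) should be able to replace the md5Hex helper above. Storing the value on the document and calling synchronizeIndex(false) after each add, as described in the thread, still has to be done through the KIM API and is not shown here.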
